Fix typos

marinak-ebi · marinak-ebi · commit 60e78f070a12 · 2025-06-13T15:02:02.000+01:00
diff --git a/annotation_pipeline/README.md b/annotation_pipeline/README.md
@@ -8,13 +8,13 @@ At the IMPC, genetic effects are identified using three statistical analysis pla
 ## Continuous data
 Continuous data are typically analysed using a linear mixed model framework. These continuous measurements are particularly informative because the direction of change can be determined through the effect size.
 
-However, due to the complexity of the data, not all continuous variables can be analysed using this framework. In such cases, the IMPC often employs the Reference Range Plus (RR) method. Control data are first discretising into three categories: low, normal, and high. Mutant data points are then classified into one of these reference categories. Finally, a Fisher's Exact Test is applied to determine whether there is a statistically significant deviation from the normal category.
+However, due to the complexity of the data, not all continuous variables can be analysed using this framework. In such cases, the IMPC often employs the Reference Range Plus (RR) method. Control data are first discretised into three categories: low, normal, and high. Mutant data points are then classified into one of these reference categories. Finally, a Fisher's Exact Test is applied to determine whether there is a statistically significant deviation from the normal category.
 
 ## Categorical data 
 Categorical data in the IMPC encompasses a range of qualitative measurements and is analysed using Fisher’s Exact Test, as implemented in the R package OpenStats.
 
 # How IMPC Annotation Pipeline Works
-The `annotationChooser` function is processes statistical analysis results called statpacket. It determines calls based on significance levels. These calles are then mapped to Mammalian Phenotype (MP) ontology terms using a provided `mp_chooser_file`. Finally, it updates the input statpacket's JSON component with the identified MP terms. If no relevant annotation is found or the statistical result is not significant, it returns the original statpacket with no MP terms added.
+The `annotationChooser` function processes statistical analysis results called statpackets. It determines calls based on significance levels. These calls are then mapped to Mammalian Phenotype (MP) ontology terms using a provided `mp_chooser_file`. Finally, it updates the input statpacket's JSON component with the identified MP terms. If no relevant annotation is found or the statistical result is not significant, it returns the original statpacket with no MP terms added.
 
 The annotation pipeline requires a reference table that summarises the available MP terms for a given IMPC parameter. This reference can be retrieved from [IMPReSS](https://www.mousephenotype.org/impress/index).
 The ETL pipeline handles this by generating the `mp_chooser.json` file.
@@ -30,7 +30,7 @@ The ETL pipeline handles this by generating the `mp_chooser.json` file.
     - UNSPECIFIED
 
 In the `mp_chooser.json` file each MP term can have different levels:
-- Ontology term levels: ABNORMAL, INCREASE, DECREASE.
+- Ontology term levels: ABNORMAL, INCREASED, DECREASED.
 - Sex levels: FEMALE, MALE and UNSPECIFIED.
 
 MP term assignment logic can be seen below:
@@ -48,7 +48,7 @@ MP term assignment logic can be seen below:
     }
 }%%
 graph TD;
-    Start{Which method is used for the analysis?} --> |MM| MM[Prioritise INCREASE/DECREASE MP term] --> A
+    Start{Which method is used for the analysis?} --> |MM| MM[Prioritise INCREASED/DECREASED MP term, otherwise use ABNORMAL] --> A
     Start --> |FE or RR| FE_RR[Only use ABNORMAL MP term] --> A
 
     A{"Is FEMALE/MALE specific MP term available in the mp_chooser file?"}