diff --git a/bulk/repo-access-api.adoc b/bulk/repo-access-api.adoc index 6abe2db3..35ffed1a 100644 --- a/bulk/repo-access-api.adoc +++ b/bulk/repo-access-api.adoc @@ -49,7 +49,24 @@ We enforce conformance to{fn-org223} * No unresolvable containment/annotation ids * No unresolvable parent ids -For now, we do not support paging, as paging tree nodes is non-trivial{fn-org204} +For now, we do not support paging, as paging tree nodes is non-trivial.{fn-org204} + +== Deleted nodes +We don't have an explicit delete API.{fn-org221} + +We don't support [[orphan, orphans]]_orphans_ for now. +They are immediately deleted.{fn-org219} + +Deleted nodes don't exist anymore in the repository from the client's point of view. +They might still exist in other contexts (e.g. another branch), or physically within the repository for internal reasons (e.g. storage optimization, concurrent editing support). +A deleted node MUST NOT appear in any responses according to this API.{fn-org220} + +A repository MAY consider the deleted node's id to be _unused_, and thus allow to re-use it. +A repository also MAY disallow re-using previously deleted node ids. + +== Error cases for all SerializationChunks +=== Node with same id sent more than once + [[apis]] == APIs @@ -71,6 +88,9 @@ None. The partitions are sent as complete nodes.{fn-org202} Does NOT include <<{m3}.adoc#Language, Languages>> or partition children. +==== Error cases +None. + [[createPartitions, createPartitions]] === createPartitions: Create new partitions Creates new partitions in the repository.{fn-org216} @@ -78,8 +98,7 @@ Creates new partitions in the repository.{fn-org216} Each sent node is its own partition. Thus, we cannot send the contents (i.e. (indirect) annotations/containments) of a partition; We can send them in a later <> call. We also MUST NOT mention any annotation/containment node ids in the partition nodes, as they cannot be part of the same request, and we don't allow moving nodes in this operation. 
- -We MAY send properties and references #TODO correct?#{fn-org225} +We MAY send properties and references{fn-org225} Each partition node id MUST NOT exist in the repository, and the sending client MUST use node ids allocated to it via <>. @@ -90,6 +109,11 @@ Each partition node id MUST NOT exist in the repository, and the sending client ==== Result #TODO# +==== Error cases +===== Partition node id already exists +===== Partition node id not reserved for this client +===== Partition node lists contained or annotated nodes + [[deletePartitions, deletePartitions]] === deletePartitions: Delete partitions and all their contents Deletes all mentioned partitions, including all (transitive) annotations and children. @@ -105,6 +129,10 @@ All (transitive) annotations and children become <>. ==== Result #TODO# +==== Error cases +===== Node with that id is not a partition +===== Node with that id does not exist +===== Invalid node id [[retrieve, retrieve]] === retrieve: Get nodes from repository @@ -137,6 +165,11 @@ We need to omit the parameter if we don't want to limit the depth. {chunk} containing all nodes according to `nodeIds` and `depthLimit` parameters. Does NOT include the definition of <<{serialization}.adoc#UsedLanguage, UsedLanguages>>, only their <<{serialization}.adoc#MetaPointer, MetaPointers>>. +==== Error cases +===== Node with requested id does not exist +===== Invalid depthLimit +===== Invalid node id + ==== Example request [source, httprequest] ---- @@ -189,21 +222,6 @@ Also, we assume no knowledge of the metamodel. #TODO do we support changes to classifier? What about changes in metapointer.version? Migration use cases?#{fn-org69} - -[[orphan]] -.Orphans -An _orphan_ node is a node present in the repository that is not mentioned in any other node as containment or annotation, and is not a partition.{fn-org219} -We cannot create orphans explicitly. 
-A node becomes an orphan if it already exists in the repository, but the node's id is removed from its parent containment/annotation, -and the node is not moved to another parent. - -#TODO: What about references to orphans? Do they resolve?# - -A repository MAY immediately <> orphans, or keep them in a <<{trash}.adoc, trash>>. - -The repository MUST NOT update/change references to orphan nodes. -Rationale: We in general do support unresolved or unresolvable references. - .How to handle unknown ids? * If requested by this client via <>: Create new node @@ -226,46 +244,13 @@ Rationale: We in general do support unresolved or unresolvable references. // include::partitions.json[] // ---- +==== Error cases +===== Node id mentioned as annotation/child in more than one parent +===== Move would create loop in tree +===== Parent / child / annotation node id unknown +===== Parent doesn't match child/annotation +===== New node id not reserved for this client -[[delete, delete]] -=== delete: Delete nodes from repository -Deletes nodes from the repository.{fn-org221} - -Deleted nodes don't exist anymore in the repository from the client's point of view. -They might still exist in other contexts (e.g. another branch), or physically within the repository for internal reasons (e.g. storage optimization, concurrent editing support). -After this call succeeds, the deleted nodes MUST NOT appear in any responses according to this API.{fn-org220} - -If we delete a node, we implicitly remove it from its parent's containment/annotation. - -(Transitively) contained/annotation nodes of deleted nodes that are not explicitly mentioned in the call are deleted. - -After this call succeeds, a repository MAY consider the deleted node's id to be _unused_, and thus allow to re-use it. -A repository also MAY disallow re-using previously deleted node ids. 
- -The whole call fails, without any changes to the repository, if any of the provided node ids does not exist in the repository, or any of the provided node ids is a <<{m3}.adoc#predefined-builtins-keys, built-in id>>. - -==== Parameters -[[delete.nodeIds]] -`nodeIds`:: List of node ids we want to delete from the repository. - -==== Result -#TODO# - -==== Example request -[source, httprequest] ----- -DELETE /bulk/delete?nodes=["first-node-id","13123123","c2Vjb25kIG5vZGUgaWQ"] ----- - -[NOTE] -==== -link:https://www.rfc-editor.org/rfc/rfc9110.html#name-delete[RFC 9110: HTTP Semantics] states about the request body of a DELETE method: - -> An origin server SHOULD NOT rely on private agreements to receive content, since participants in HTTP communication are often unaware of intermediaries along the request chain. - -Thus, this example sends the node ids via URL query. - -==== [[ids, ids]] === ids: Get available ids @@ -298,6 +283,9 @@ It MAY return less than `count` ids. ==== Result List of ids guaranteed to be free. +==== Error cases +None. + ==== Example request [source, httprequest] ---- @@ -321,7 +309,7 @@ create:: <> for partitions, <> that sends a update:: <> that sends a node (both partitions and other nodes) with an _existing id_, including all its features (both updated and unchanged). -delete:: <> for partitions (including all descendants), <> for other nodes +delete:: <> for partitions (including all descendants), for others <> of the parent node without mentioning the deleted node. move:: Assume we want to move node `N` from its current parent `S` to its new parent `T`. 
+ diff --git a/derived/completeness.adoc b/derived/completeness.adoc new file mode 100644 index 00000000..6408d083 --- /dev/null +++ b/derived/completeness.adoc @@ -0,0 +1,175 @@ += Completeness Scenario + +"complete" means "a processor has finished all its work _up to the point in time at which we asked it_" + +Same as in https://github.com/LionWeb-io/specification/issues/248#issuecomment-2079730360[#248] + +== Assumptions + +* processors use the same APIs as other clients to retrieve their base models (and any other nodes they might need). + +* There is a central authority `derivationBackend` within a repository that handles requests for derivations. + +* `derivationBackend` has a special API to ask processors for a _complete_ derivation. +(Note that this only contains the derived nodes from this processor -- `derivationBackend` aggregates the results of all processors that contribute to the same derivation). + +NOTE: Red activity lines for repository are updates. + +Yellow boxes are the incoming, unprocessed changes. 
+ +[plantuml,scenario,svg] +---- +!pragma teoz true + +actor generator as gen +participant derivationBackend as db +participant ScopeProcessor as scope<> +participant ValidationProcessor as val<> +participant DomainValidator as dom<> +database repository as repo + +autonumber 1 + +repo <<-]: change1 +activate repo #red +autonumber 1 +repo ->> scope: change1 +activate scope +autonumber 1 +scope -> scope: queue +note over scope: inbox:\nchange1 +[-[hidden]>scope +deactivate scope + +autonumber 1 +repo ->> dom: change1 +deactivate repo +activate dom +autonumber 1 +dom -> dom: queue +note over dom: inbox:\nchange1 +[-[hidden]>dom +deactivate dom + +autonumber 1 +repo <<-]: change2 +activate repo #red +autonumber 1 +repo ->> scope: change2 +note over scope: inbox:\nchange1\nchange2 +autonumber 1 +repo ->> dom: change2 +note over dom: inbox:\nchange1\nchange2 +deactivate repo + +autonumber 1 +repo <<-]: change3 +activate repo #red +autonumber 1 +repo ->> scope: change3 +note over scope: inbox:\nchange1\nchange2\nchange3 +autonumber 1 +repo ->> dom: change3 +note over dom: inbox:\nchange1\nchange2\nchange3 +deactivate repo + +== global state: aa == + +autonumber 2 + +gen -> db: get validation\nderivation +activate db + autonumber 3 + db ->> scope + activate scope + + autonumber 3 + db ->> val + activate val + + autonumber 3 + db ->> dom + activate dom + + autonumber 13 + & dom -> repo ++: get persisted derived model + return + + autonumber 4 + scope -> scope: process change1 + note over scope: inbox:\nchange2\nchange3 + autonumber 10 + & val -> repo++: get original model + return + + + autonumber 5 + scope -> scope: process change2 + note over scope: inbox:\nchange3 + + + autonumber 14 + & dom -> dom: process change1\nno change + note over dom: inbox:\nchange2\nchange3 + autonumber 15 + dom -> dom ++: process change2 + autonumber stop + dom -> repo ++ #red: update persisted derived model + return + +== global state: ab == + + return + note over dom: inbox:\nchange3 + + 
autonumber 12 + val -> val: calculate + autonumber 12 + val -->> db: validations + deactivate val + + + autonumber 6 + repo <<-]: newChange + activate repo #red + autonumber 7 + repo ->> scope: newChange + note over scope: inbox:\nchange3\ndelayed inbox:\nnewChange + autonumber 16 + repo ->> dom: newChange + note over dom: inbox:\nchange3\ndelayed inbox:\nnewChange + deactivate repo + +== global state: bb == + + autonumber 7 + scope -> scope: process change3 + note over scope: inbox:\nempty\ndelayed inbox:\nnewChange + autonumber 8 + scope -->> db: validations + deactivate scope + note over scope: inbox:\nnewChange + + autonumber 17 + dom -> dom: process change3\nno change + note over dom: inbox:\nempty\ndelayed inbox:\nnewChange + autonumber 18 + dom -->> db: validations + deactivate dom + note over dom: inbox:\nnewChange + +autonumber 20 +gen <-- db +deactivate db + +activate scope +autonumber 9 +scope -> scope: process newChange +[-[hidden]>scope +deactivate scope + +activate dom +autonumber 19 +dom -> dom: process newChange\nno change +[-[hidden]>dom +deactivate dom +---- diff --git a/derived/derivation.adoc b/derived/derivation.adoc new file mode 100644 index 00000000..b4624e08 --- /dev/null +++ b/derived/derivation.adoc @@ -0,0 +1,368 @@ +include::../shared/issue-footnotes.adoc[] +:serialization: ../serialization/serialization +:m3: ../metametamodel/metametamodel +:bulk: ../bulk/repo-access-api +:chunk: <<{serialization}.adoc#SerializationChunk, SerializationChunk>> + += Derived Models +:toc: preamble +:toclevels: 2 + +Derived models are calculated from other (original or derived) models without direct human interaction. +They are usually some form of analysis result, such as one related to a type system. +Nodes in derived models are typically associated with an original node -- e.g., the type computed for an AST node. +The repository manages this association. +Derived models may be persisted or be recalculated on the fly. 
+ +== Terminology +original node:: +A node CRUDed by users (mediated by tools). + +derived node:: +A node implementing `IDerived` that can be calculated from one or more _base_ nodes. ++ +NOTE: A derived node is still a node with all its capabilities. +Thus, a derived node MUST be (indirectly) contained by a partition. + +base node:: +A node that has _derived_ nodes. +The base node can be an _original_ or _derived_ node. + +ava:: +A <<{m3}.adoc#Language, Language>> that defines the requested contents of a _derived_ model. +The _derived nodes_ are instances of ava. ++ +Example: We request ava `com.example.validation`. +This language defines interface `Warning`. +Another language `org.compiler.validation` _dependsOn_ `com.example.validation`, +and defines classifiers `ParserWarning implements IDerived, Warning` and `LinkerWarning implements IDerived, Warning`. ++ +[plantuml, avaExample, svg] +---- +package "com.example.validation" as ex { + interface Warning +} + +package "org.compiler.validation" as comp { + class ParserWarning implements .IDerived, ex.Warning + class LinkerWarning implements .IDerived, ex.Warning +} + +comp .> ex: dependsOn +---- ++ +The derived model contains instances of `ParserWarning` and `LinkerWarning`. ++ +NOTE: The name "ava" is a sufficiently meaningless placeholder until we think of a good name. + +derivation:: +The _derived_ nodes referencing a _base_ node. +Can be specified by the _derived_ node's _ava_. + +original model:: +A model that cannot be (re-)computed from other models. ++ +It contains mainly _original_ nodes, but MAY contain _derived_ nodes. +Example: A processor uses the repository to store its derivation result. + +derived model:: +A model that can be calculated from _base_ models. +Its partition, and possibly many other nodes, are _derived_ nodes. ++ +However, a derived model MAY contain nodes that do not implement `IDerived`. +Example: A derived richtext description contains `Word` concepts. 
+`Word` is reused in many contexts, and does not implement `IDerived`. + +base model:: +A model that has _derived_ models. +A base model can be an _original_ or _derived_ model. + +== New builtin interface `IDerived` +[plantuml, iderived, svg] +---- +interface "builtins::Node" as Node + +interface IDerived + +IDerived "0..*" -> "1..*" Node: base +---- + +Each derived node references one or more base nodes. +Any (base) node can have none or more derived nodes. + +[horizontal,labelwidth=12] +QUESTION:: Should a derived node also be able to refer to features inside a node? Or do we leave that for specific classes implementing `IDerived`? +QUESTION:: The assumption is that derived nodes can have children that are not derived nodes. +Of course children could also be derived nodes. + +== Client use cases: +* "Which validation issues have been found on this partition?" + +* "What are the types of all nodes in this partition?" + +* "What's the type of this node I reached via reference, i.e. I don't know its partition?" + +* "Update me any time the validation of this node changes" + +* "Give me this partition including validation and typing info" + +* "Give me this partition FAST, I don't care about validation and typing info" + +== Client Bulk API +=== retrieveDerivation +Retrieves derivations of subtrees of listed node ids, according to listed languages. + +==== Parameters +[[retrieveDerivation.nodeIds]] +`nodeIds`:: List of node ids we want to retrieve derivations about from the repository. + +[[retrieveDerivation.languages]] +`languages`:: List of _avas_ to specify the kind of derivations. +Optional parameter, defaults to _all derivations_. +If present, MUST be a list of (language key, language version) pairs. + +[[retrieveDerivation.depthLimit]] +`depthLimit`:: Limit the depth of retrieved subtrees. +Optional parameter, defaults to _infinite_. 
+If present, MUST be an integer >= 0, with ++ +-- +* 0 meaning "return only the nodes with ids listed in `nodeIds` parameter", +* 1 meaning "return the nodes with id listed in the `nodeIds` parameter and their direct children/annotations", +* 2 meaning "return the nodes with id listed in the `nodeIds` parameter, their direct children/annotations, and the direct children/annotations of these", +* etc. +-- ++ +NOTE: There's no _magic value_ of `depthLimit` to express _infinite_ depth. +We need to omit the parameter if we don't want to limit the depth. + +==== Result +{chunk} containing all derived nodes according to `nodeIds`, `languages`, and `depthLimit` parameters. + +[horizontal,labelwidth=12] +QUESTION:: The `depthLimit` describes the depth of the retrieved nodes (derived nodes may be full trees), but the next line suggests that `depthLimit` applies to the node ids sent. Needs clarification. + +First, we find all _base_ nodes according to `nodeIds` and `depthLimit` parameter (see <<{bulk}.adoc#retrieve, Bulk retrieve>>). +Then, we find all _derivations_ according to `languages`. +The result contains all derivations and all their descendants. + +Does NOT include the base nodes mentioned in `nodeIds`, or their descendants. +Does NOT include the definition of <<{serialization}.adoc#UsedLanguage, UsedLanguages>>, only their <<{serialization}.adoc#MetaPointer, MetaPointers>>. + + +== Possible derivation backends +[horizontal,labelwidth=12] +QUESTION:: Is the repository referred to here always the repository where the original model is stored, or could it be a different (i.e. derivation-processor-specific) repository? + +[[permanent-repo, permanently stored in repository]] +Permanently stored in repository:: +_Example: Information about import source._ +We don't want to use annotations, as it's a lot of data, and we rarely need it. +So we don't want to burden the original model with it. 
+We create this derived model once, store it in the repository, and only change it on re-import. +Thus, derived node ids are stable. ++ +NOTE: This implements option G in #13{fn-org13}. + +[[permanent-external, permanently stored externally]] +Permanently stored externally (with identity):: +_Example: Extended personnel data._ +We relate original model nodes to persons in active directory. +All external data has its own identity (a GUID). +We derive node ids from external identity. + +[[temp-repo, temporarily stored in repository]] +Temporarily stored in repository:: +_Example: Recalculation-expensive validation results._ +We run potentially expensive validators on model change, and store their result in the repository. +As soon as we re-execute a validator, we delete all previous results of that validator from the repository. +Thus, derived node ids of the same validation result are stable, but once we re-calculate it, the node id changes +(even if it's semantically the same validation result). + +[[internal, internally stored]] +Internally stored:: +_Example: Complex type calculation engine with in-memory representation of its results._ +We update in-memory type information on model change. +We build the derived model from in-memory representation. +We store derived node ids also in memory, so they are stable for the lifetime of the in-memory representation. + +[[live, live-calculated]] +Live-calculated:: +_Example: Simple programming language where the type is always explicitly mentioned._ +We never infer any type, just look up a few references. +Thus, we always calculate the type on-the-fly, and never persist it. +Derived node ids change on every request. + +== Backend implementation of `retrieveDerivation` +A processor can `register` or `unregister` itself for one or more _ava_. +Only <>, <>, and <> backends would register. + +On a call to `retrieveDerivation`, we forward the request to the registered processors. 
+We also check the repository for partitions that implement `IDerived` and are instances of a classifier of one of the _ava_ languages. + +=== Example +Assume a _base_ model with nodes `a`,`b`,`bb`,`c`. +`a` is a partition containing `b` and `c`. +`b` contains `bb`. + +[plantuml, retrieveDerivationExample, svg] +---- +hide empty members + +object a +object b +object bb +object c + +a *-- b +a *-- c +b *-- bb +---- + +.Languages + +* `ValidationLang`, defines +** `IViolation` +* `M2Validation` _dependsOn_ `ValidationLang`, defines +** `PropertyViolation` implements `IDerived`, `IViolation` +** `MultiplicityViolation` implements `IDerived`, `IViolation` +* `DomainValidation` _dependsOn_ `ValidationLang`, defines +** `DomainValidationPartition` implements `IDerived` +** `InvalidNameViolation` implements `IDerived`, `IViolation` +* `TypeLang`, defines +** `StringType` implements `IDerived` +** `IntType` implements `IDerived` +** `EnumType` implements `IDerived` +** `UnknownType` implements `IDerived` + +[plantuml, retrieveDerivationLanguages, svg] +---- +hide empty members + +'interface IDerived + +package ValidationLang { + interface IViolation +} + +package M2Validation { + class PropertyViolation implements ValidationLang.IViolation + ', .IDerived + PropertyViolation --> .M3.Property: property + class MultiplicityViolation implements ValidationLang.IViolation + ', .IDerived + MultiplicityViolation --> .M3.Feature: feature +} +'M2Validation .> ValidationLang: dependsOn + +package DomainValidation { + class DomainValidationPartition<> + 'implements .IDerived + class InvalidNameViolation implements ValidationLang.IViolation + ', .IDerived + { + message: string + } + DomainValidationPartition *-- InvalidNameViolation +} +'DomainValidation .> ValidationLang: dependsOn + +package TypeLang { + class StringType + class IntType + class EnumType + class UnknownType +} + +package M3 { + interface Feature + class Property implements Feature +} +---- + 
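As a rough illustration of the first step of `retrieveDerivation` (finding the _base_ nodes from `nodeIds` and `depthLimit`), the following minimal sketch walks the example tree `a`, `b`, `bb`, `c`. It assumes that `depthLimit` applies to the base nodes, as the result description states; the open QUESTION on this point may change that. The names `CHILDREN` and `select_base_nodes` are hypothetical and not part of any LionWeb API, and annotations are ignored for brevity.

```python
from collections import deque

# Hypothetical in-memory form of the example base model: node id -> child ids.
# (Assumption for illustration only; a real repository stores this differently.)
CHILDREN = {"a": ["b", "c"], "b": ["bb"], "bb": [], "c": []}


def select_base_nodes(node_ids, depth_limit=None):
    """Return the requested nodes plus descendants down to depth_limit.

    depth_limit=None models an omitted parameter, i.e. unlimited depth.
    """
    if depth_limit is not None and depth_limit < 0:
        raise ValueError("depthLimit must be an integer >= 0")
    selected = []
    seen = set()
    # Breadth-first walk; every requested node starts at depth 0.
    queue = deque((node_id, 0) for node_id in node_ids)
    while queue:
        node_id, depth = queue.popleft()
        if node_id in seen:
            continue
        seen.add(node_id)
        selected.append(node_id)
        # Only descend while we are still above the limit.
        if depth_limit is None or depth < depth_limit:
            queue.extend((child, depth + 1) for child in CHILDREN[node_id])
    return selected


print(select_base_nodes(["a"], 0))  # ['a']
print(select_base_nodes(["a"], 1))  # ['a', 'b', 'c']
print(select_base_nodes(["a"]))     # ['a', 'b', 'c', 'bb']
```

Under this reading, the derivations would then be looked up for each selected base node and filtered by `languages`; that second step depends on the backend and is not sketched here.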
+[horizontal,labelwidth=12] +QUESTION:: Should the `PropertyViolation` also point to the Property instance in the M2 model being validated, and not just to the node that contains the property? + +.Available backends +* <> Domain validator providing `ValidationLang` _ava_ +* <> Typesystem calculator providing `TypeLang` _ava_ +* <> M2 validator providing `ValidationLang` _ava_ + +.Description +1. [registration] M2 validator and Typesystem calculator processors register themselves with their _ava_. +2. [prebuild] Domain validator processor creates its temporary _derived_ partition `x`, containing one node with id `xx`, and stores it to repository. +Domain validator retrieved these free node ids from repository. +3. Typesystem calculator calculates the types of all original nodes. +It requests free node ids `ff`, `fg`, `fh`, `fi` for the resulting derived nodes and stores them in its internal representation. +4. [model update] Typesystem calculator is notified that node `c` changed. +It recalculates the type, but doesn't succeed, resulting in an _unknown type_. +Typesystem calculator requests a free node id for the resulting derived node `UnknownType` and stores the node id in its internal representation. +5. [request] A client asks to retrieve a derivation for nodes `a`, `b`, `c` with infinite depth for _avas_ `ValidationLang` and `TypeLang`. +6. Backend asks Typesystem calculator to provide `TypeLang` derivations. +7. Typesystem calculator returns nodes `ff`, `fg`, `fh`, `gg` according to its internal representation. +Note that node `fi` is replaced by `gg` (from model update). +8. Backend asks M2 validator to provide `ValidationLang` derivations. +9. M2 validator replies with two new nodes `eef` and `csa`. +M2 validator got the node ids for the new nodes from repository. +10. Backend asks Repository for all fitting partitions, and retrieves each partition's contents. +11. Repository returns `x` and `xx`. +12. 
Backend concatenates all results and returns them to client. + +[plantuml, retrieveDerivationImpl, svg] +---- +actor client +participant "retrieveDerivation\nbackend" as backend +participant "Domain\nvalidator" as domval +participant "Typesystem\ncalculator" as typer +participant "M2 validator" as langval +participant repository + +== registration == +autonumber 1 +langval ->> backend: register(ValidationLang) +autonumber stop +typer ->> backend: register(TypeLang) + +== prebuild == +autonumber resume +domval<<-] ++ +autonumber stop + domval -> repository ++: ids(count=1) + return [x] + domval ->> repository: store([\n DomainValidationPartition:x()\n InvalidNameViolation:xx(base=[b], message="no space name")\n]) +deactivate domval + +autonumber resume +typer<<-] ++ +autonumber stop + typer -> typer ++: typeAll() + typer -> repository ++: ids(count=4) + return [ff, fg, fh, fi] + deactivate typer +deactivate typer + +== model update == +autonumber resume +typer<<-] ++: modelChange(c) +autonumber stop + typer -> typer ++: type(c) + typer -> repository ++: ids(count=1) + return [gg] + deactivate typer +deactivate typer + +== request == +autonumber resume +client -> backend ++: retrieveDerivation(\n baseNodeIds=[a,b,c]\n languages=[ValidationLang, TypeLang]\n) + backend ->> typer ++: provideDerivation(baseNodeIds=[a,b,c]) + return [\n StringType:ff(base=[a])\n IntType:fg(base=[b])\n EnumType:fh(base=[bb,myEnum])\n UnknownType:gg(base=[c])\n] + backend ->> langval ++: provideDerivation(baseNodeIds=[a,b,c]) +autonumber stop + langval -> repository ++: ids(count=2) + return [eef, csa] +autonumber resume + return [\n PropertyViolation:eef(base=[a], property=x)\n MultiplicityViolation:csa(base=[bb], feature=age)\n] + backend ->> repository ++: listPartitions().where(\n [ValidationLang, TypeLang].contains(\n it.classifier.allSpecializedLanguages()\n )\n)\n.retrieve() + return [\n DomainValidationPartition:x()\n InvalidNameViolation:xx(base=[b], message="no space name")\n] 
+return [\n StringType:ff\n IntType:fg\n EnumType:fh\n UnknownType:gg\n PropertyViolation:eef\n MultiplicityViolation:csa\n DomainValidationPartition:x\n InvalidNameViolation:xx\n] +---- diff --git a/derived/processors.adoc b/derived/processors.adoc new file mode 100644 index 00000000..89c6f5e9 --- /dev/null +++ b/derived/processors.adoc @@ -0,0 +1,133 @@ +:serialization: ../serialization/serialization +:m3: ../metametamodel/metametamodel +:bulk: ../bulk/repo-access-api +:chunk: <<{serialization}.adoc#SerializationChunk, SerializationChunk>> + += Processors and Derived Models +:toc: preamble +:toclevels: 2 + +== Terminology + +repository:: +A _repository_ is the "world". +- A repository is the scope in which ids are unique. +- A repository is the scope in which referred nodes can be resolved. +- A repository is a collection of partitions. + +original node:: +A node CRUDed by users (mediated by tools). + +derived node:: +A node implementing `IDerived` that is attached (by reference) to one base node and +that can be calculated from its base node and (optionally) other nodes. + +calculated node:: A node that is (automatically) calculated from other nodes. +A derived node is a calculated node, but not vice versa. + +model:: +A _model_ is a subset of a repository. +- A model is a collection of partitions (from the same repository). + +original model:: +An _original model_ is a model where all nodes are original nodes. +- An original model cannot be (re-)computed from other models. + +derived model:: +A _derived model_ is a model that can be calculated from _base_ models. +- A derived model has no original nodes. +- In a derived model all nodes are created (automatically) by a processor. + It contains derived nodes and calculated nodes. + ++ +However, a derived model MAY contain nodes that do not implement `IDerived`. +Example: A derived richtext description contains `Word` concepts. +`Word` is reused in many contexts, and does not implement `IDerived`. 
+In this case `Word` is a calculated node, but *not* a derived node. + +NOTE: A repository always contains exactly one original model and any number of derived models. + +NOTE: As defined here, nodes that are calculated by a processor can never be children or annotations of an original node. + +base model:: +A model that has _derived_ models. +A base model can be an _original_ or a _derived_ model. + +processor:: +A _processor_ is a component that provides/calculates derived nodes for nodes in a base model. +- A processor is always directly associated with exactly one _base model_. +- A processor creates (derived and calculated) nodes in exactly one _derived model_. +- A processor may use nodes (derived, calculated or original) from any model to calculate its derived model. + +NOTE: Multiple processors may use the same language in their derived models. +E.g. a _model validation_ processor and a _type errors_ processor and a _deadlock detection_ processor may all use the same _Findings_ language. +Because of this, derived models cannot be identified by their used languages. + +== Getting a derived node +Assume a client has base nodes _BaseA_, _BaseB_ and _BaseC_. +To ask for a derived node the client needs: + +- The node id of BaseN. +- The identification of the processor/derived model. + +Given processors _Red_, _Yellow_ and _Blue_ (don't be afraid of them :-) the client can ask: + + Get derived node Blue for BaseA.id + +or + + Get derived node Orange for BaseA.id and BaseB.id + +Note that _Processor Blue_ and _Derived Model Blue_ are conceptually the same +from the point of view of the client. + +NOTE: In this approach the language(s) used by a derived model play no role at all. +The response chunk containing the derived nodes (and their children etc.) will contain `usedLanguages`. + +The picture below shows an actual instantiation of the above described structure. 
+ +.Repository with models and processors +image::processors.png[width=100%] + + +== Client Confusion +In general, a repository may contain derived models, so requesting all +partitions returns both original partitions and all derived partitions. +In many cases this might not be what the client needs. + +- How can a client see the difference? +- How can we ensure a client does not start editing / changing derived models? + +*=> Do we need to formalize the notion of Models (see definition above) in the repository?* + +== Repository Confusion +Assume that processors use the existing bulk or delta API to store/retrieve nodes. + +- How does the repository know which nodes/partitions are derived or calculated? + * The repository should be able to work without knowing the language(s), + therefore the repository cannot consult a language definition to know which nodes are derived. + * For calculated nodes this is even harder as there is no information in the language + to deduce whether a node is calculated. +- If the previous point is solved, how does the repository know to which processor they belong? + +== Processors +A number of questions need to be answered about processors: + +- A processor is related to exactly one base model; this relationship needs to be defined somewhere. + * As there is one original model in a repository, connecting a processor to an original base model is + identical to connecting / registering a processor to a repository. + * The relationship to the derived model of the processor is less easy as + there might be many derived models in a repository. +- A processor typically expects nodes from one or more specific languages. + * E.g. a type processor needs to understand the nodes for which it is calculating the types, while a scoping processor needs to understand the scoping rules of the specific language(s). 
+ * +- Processors are attached to a repository, because they need access to their base model and + the repository is the only place where this can be found. +- Who starts a processor, and when? + * We do not want the client to explicitly start (and/or stop) processors. + A client should simply ask for a derived node for a certain processor/derived model. + The processor may already be running, or it will be started. + Except perhaps for performance, this should be invisible to the client. +- Can a processor create multiple derived nodes for one base node? + * E.g. a type processor may not only calculate types, but also type errors for a node. + Do we want to allow this? diff --git a/derived/processors.png b/derived/processors.png new file mode 100644 index 00000000..b98e70bb Binary files /dev/null and b/derived/processors.png differ diff --git a/derived/processors.pptx b/derived/processors.pptx new file mode 100644 index 00000000..66065f6b Binary files /dev/null and b/derived/processors.pptx differ