Editor: Christoph Fabianek, OwnYourData https://OwnYourData.eu, christoph@ownyourdata.eu, https://www.linkedin.com/in/fabianek/
@@ -97,7 +96,6 @@ SOyA is a data model authoring and publishing platform and also provides functio
97
96
<img src="res/overview.png" width="800"><br>
98
97
<em>Figure 1: Building blocks in SOyA</em>
99
98
</p>
100
-
</div>
101
99
102
100
103
101
## Terminology ## {#terminology}
@@ -114,7 +112,7 @@ This document uses the following terms as defined in external specifications and
114
112
*in RDF:* an RDF Class with one or more properties
115
113
116
114
: <dfn lt=DRIs>DRI</dfn>
117
-
:: A **D**ecentralized **R**esource **I**dentifier represents a content based address for a [=terms/structure]. Within SOyA Multihash [[MULTIHASH]] (default: `sha2-256`) is used for hashing a JSON object and Multibase [[MULTIBASE]] (default: `base58-btc`) for encoding the hash value.
115
+
:: A **D**ecentralized **R**esource **I**dentifier represents a content based address for a [=terms/structure=]. Within SOyA Multihash [[MULTIHASH]] (default: `sha2-256`) is used for hashing a JSON object and Multibase [[MULTIBASE]] (default: `base58-btc`) for encoding the hash value.
118
116
119
117
: <dfn>Instance</dfn>
120
118
:: is a data record (e.g. an data describing an employee) with a set of properties as defined in a [=terms/Base=] or [=terms/Structure=]<br>
@@ -161,7 +159,7 @@ The Semantic Overlay Architecture (SOyA) strategy is intended to simplify the en
161
159
<img src="res/concept.png" width="800"><br>
162
160
<em>Figure 2: Conceptual view of SOyA</em>
163
161
</p>
164
-
</div>
162
+
165
163
166
164
A **Knowledge Scientist** is responsible for designing the data model for certain applications. This person has considerable experience in data modeling, however, does not necessarily have the knowledge of Semantic Web technologies. Unlike Data Engineers, the focus is mainly on the data model and does not concern with populating the data model with data instances. The following requirements are identified:
167
165
@@ -194,7 +192,6 @@ The Semantic Overlay Architecture (SOyA) is built on the following core componen
<h2 class="no-num no-toc no-ref heading settled" id="profile-and-date"><span class="content">Draft Community Group Report, <time class="dt-updated" datetime="2022-07-07">7 July 2022</time></span></h2>
609
+
<h2 class="no-num no-toc no-ref heading settled" id="profile-and-date"><span class="content">Draft Community Group Report, <time class="dt-updated" datetime="2023-08-22">22 August 2023</time></span></h2>
<p>This specification is not a W3C Standard nor is it on the W3C Standards Track. Learn more about <a href="https://www.w3.org/community/">W3C Community and Business Groups</a>. <a href="https://github.com/OwnYourData/soya/issues">GitHub Issues</a> are preferred for discussion of this specification.</p>
708
+
<p>Sections of this document have also been submitted to <a href="https://icodse.org/">ICoDSE 2023</a> as part of a paper.</p>
<p><strong>Decentralised:</strong> avoid any centralized components or addressing (i.e., use decentralized resource identifiers - <a data-link-type="dfn" href="#terms-dris" id="ref-for-terms-dris">DRIs</a> - where possible)</p>
<p>The growing importance of data over the last two decades has encourage organizations across all sectors to undergo transformations towards data-driven operations. For many years, these organizations used relational databases as their main data solutions. However, the exponential growth of data has exposed the limitations of relational databases, such as the costly adaptation of database schemas and applications in response to evolving application needs and the lack of support for interoperability and exchange between heterogeneous data sources.</p>
763
-
<p>RDF and the related semantic web technologies are appealing as a vendor neutral framework for using graph data as alternative solution to address the issues with relational databases. However, the perceived difficulty of use has made RDF and related semantic web technologies to be categorized as a niche technology. This is unfortunate because it restricts uptake and inhibit RDF from being viewed as a feasible choice for many use cases.</p>
764
-
<p>To address these challenges, Semantic Overlay Architecture (SOyA), is proposed as a lightweight, semantic-web based approach for data integration and exchange. At the core of this approach is the SOyA structure, a YAML-based data model for describing graph data that is RDF-compatible, which consists of one or more soya:<a data-link-type="dfn" href="#terms-base" id="ref-for-terms-base④">Base</a>, which represent RDF classes and their properties, and zero or more soya:<a data-link-type="dfn" href="#terms-overlay" id="ref-for-terms-overlay①">Overlay</a>, which provides additional information and context to soya:Base as well as processing definitions. Furthermore, to support developers in conducting the most common data processing for graph data, a number of predefined soya:Overlay are defined, e.g., such as soya:AnnotationOverlay for data model description and soya:ValidationOverlay for constraint checking.</p>
763
+
<p>The escalating significance of data in the past twenty years has propelled organizations in various sectors to shift towards data-centric operations. Traditionally, these entities relied on relational databases as their primary data repositories. Yet, the rapid surge in data volume has highlighted the drawbacks of such databases, notably their expensive adaptability requirements and the challenges in ensuring compatibility between different data sources.</p>
764
+
<p>RDF, along with associated semantic web technologies, stands out as a neutral platform, offering graph data as a solution to the aforementioned relational database problems. Nevertheless, the perceived complexities in utilizing RDF have relegated it to a niche status, limiting its widespread adoption and diminishing its viability for many applications.</p>
765
+
<p>To combat these issues, the Semantic Overlay Architecture (SOyA) has been introduced. This offers a nimble, semantic-web centric strategy for data amalgamation and exchange. At its heart lies the SOyA structure: a YAML-based data design that’s compatible with RDF. This structure includes one or more soya:<a data-link-type="dfn" href="#terms-base" id="ref-for-terms-base④">Base</a> entities, which denote RDF classes and attributes, and possibly several soya:<a data-link-type="dfn" href="#terms-overlay" id="ref-for-terms-overlay①">Overlay</a> elements that add depth and context to the soya:Base while also defining processing guidelines. Additionally, to aid developers with routine graph data operations, predefined soya:Overlay types are available, like soya:AnnotationOverlay for data model elaboration and soya:ValidationOverlay for checks and balances.</p>
765
766
<h3 class="heading settled" data-level="1.4" id="lifecycle"><span class="secno">1.4. </span><span class="content">Lifecycle of RDF Data Engineering</span><a class="self-link" href="#lifecycle"></a></h3>
766
-
<p>A generic lifecycle for construction and maintaining knowledge for knowledge graphs consists of four phases: (i) knowledge creation, (ii) knowledge hosting, (iii) knowledge curation, and (iv) knowledge deployment. The lifecycle is proposed to address two main challenges, which are (a) integrating data from heterogeneous sources in a scalable manner, and (b) to build a high-quality resource given the applications at hand. Complementary to this lifecycle three main roles can be identified in data-driven organizations: Data Engineers, who mainly focus on harnessing and collecting data, Knowledge Scientists, who aim to make reliable data, and Data Scientists, focusing on drawing value from data.</p>
767
-
<p>The Semantic Overlay Architecture (SOyA) approach aims to reduce the barrier of entries for non-semantic web experts in utilizing semantic web technologies for addressing their data interoperability and exchange needs. More precisely, SOyA aims to support stakeholders in conducting the steps in the Knowledge Graph lifecycle described above.</p>
767
+
<p>A standard process for constructing and sustaining knowledge in knowledge graphs is segmented into four stages: (i) knowledge creation, (ii) knowledge hosting, (iii) knowledge curation, and (iv) knowledge deployment. This process is designed to overcome two primary obstacles: firstly, merging data from diverse sources in an efficient manner, and secondly, crafting a superior-quality resource tailored to specific applications. Alongside this process, there are three key roles evident in data-driven entities: Data Engineers, who primarily gather and manage data; Knowledge Scientists, whose goal is to ensure data reliability; and Data Scientists, who extract insights from the data.</p>
768
+
<p>The Semantic Overlay Architecture (SOyA) strategy is intended to simplify the entry for those unfamiliar with semantic web expertise, enabling them to harness semantic web tools for their data compatibility and exchange requirements. Specifically, SOyA seeks to guide users through the stages of the Knowledge Graph process mentioned earlier.</p>
768
769
<p align="center"><img src="res/concept.png" width="800"><br><em>Figure 2: Conceptual view of SOyA</em></p>
769
770
<p>A <strong>Knowledge Scientist</strong> is responsible for designing the data model for certain applications. This person has considerable experience in data modeling, however, does not necessarily have the knowledge of Semantic Web technologies. Unlike Data Engineers, the focus is mainly on the data model and does not concern with populating the data model with data instances. The following requirements are identified:</p>