IDA-FBK/FuS-KG

CC BY-SA 4.0 Colab HTML Documentation

FuS-KG: A Multi-Modal Knowledge Graph Supporting Personalized Health

Functional Status Information (FSI) describes physical and mental wellness at the whole-person level. It includes information on activity performance, social role participation, and environmental and personal factors affecting a person’s well-being and quality of life. Collecting, integrating, and analyzing this multi-modal information, which spans different domains, is crucial for addressing the needs of an aging global population and providing effective care for individuals with chronic conditions, multi-morbidity, and disabilities. Multi-Modal Knowledge Graphs are a suitable way to integrate this information in a complete and structured form, enabling reasoning and the construction of tailored coaching solutions that support individuals in healthy daily living. FuS-KG aims to play a central role in the design and development of middle-layer applications of explainable behavior change systems by: (i) modeling conceptual information that represents individuals’ FSI and using it to adapt the generation of explanatory and motivational messages to the individual; (ii) supporting interoperability among different systems, which could share, for example, databases of motivational messages or explainability algorithms; and (iii) managing privacy and ethical issues relating to user data.


FuSKG Building Pipeline


Ontology development

We developed FuS-KG by combining METHONTOLOGY and Modular Ontology Modeling (MOMo) methodologies, ensuring a systematic and modular lifecycle for building, maintaining, and evolving the knowledge graph. The process included:

  1. Specification

    Starting from research questions (RQs):

    • RQ1: Which set of conceptual domains covers the entire knowledge required to represent a user’s FSI completely and effectively, and how can such domains be integrated with the dynamic nature of user-generated knowledge?
    • RQ2: How should the elicited knowledge representing all the FSI domains be structured so that it can be reused by different solutions with different goals?
    • RQ3: What is a suitable methodology to build and maintain a large, highly complex KG integrating knowledge from complementary domains?

    and a literature review on FSI and related domains, we derived a set of high-level requirements (REQs) to guide the FuS-KG development:

    • REQ1: Conceptualize food domain at a granular level (nutrients to complex recipes).
    • REQ2: Provide activities with effort metrics to determine food requirements.
    • REQ3: Model barriers affecting activities, food intake, and adherence to guidelines.
    • REQ4: Include images, videos, and other modalities to support education and knowledge injection.
    • REQ5: Define a user model and link it to domain knowledge while meeting privacy and data requirements.
    • REQ6: Define guidelines for behavior intervention.
    • REQ7: Incorporate time aspects for activity steps, guideline adherence, and data tracking.
    • REQ8: Use modular design for scalability and ease of maintenance.

    Each requirement is supported by a set of competency questions (CQs), which are fully detailed in the FUSKG-REQs&CQs file available in the repository.

  2. Knowledge Acquisition

    The knowledge needed to build FuS-KG was acquired in two steps: (i) collaborating with domain experts to model core entities (abstract classes and properties), starting from the HeLiS ontology to address key requirements; and (ii) identifying and analyzing unstructured and structured sources to populate FuS-KG.

    Data sources included:

    • Food domain: HeLiS-based models from Italian agricultural and epidemiological archives, plus four additional datasets: the USDA database, Recipe1M+, Tasty, and RecipeDB.
    • Activity domain: The Compendium of Physical Activities, providing a taxonomy and effort measures.
    • Barrier domain: The Supports Intensity Scale (SIS) manual, modeling barriers affecting a person’s functional status.
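
To illustrate how an effort measure can feed into food requirements (REQ2), here is a minimal sketch. It assumes the effort measure is a MET value, as used by the Compendium of Physical Activities, and applies the common approximation that 1 MET is about 1 kcal per kg of body weight per hour. The function name, the dictionary, and the MET values are illustrative and are not read from FuS-KG.

```python
def energy_kcal(met: float, weight_kg: float, duration_min: float) -> float:
    """Estimate energy expenditure with the approximation
    1 MET ~ 1 kcal per kg of body weight per hour."""
    return met * weight_kg * (duration_min / 60.0)


# Illustrative MET values only (hypothetical keys, not FuS-KG identifiers).
ACTIVITY_MET = {
    "walking_moderate": 3.5,
    "running_general": 8.0,
}

# 30 minutes of moderate walking for a 70 kg person.
kcal = energy_kcal(ACTIVITY_MET["walking_moderate"], weight_kg=70, duration_min=30)
print(f"Estimated expenditure: {kcal:.1f} kcal")
```

An estimate like this could be compared against the nutritional content of planned meals, which is the kind of link between the activity and food domains that REQ2 asks for.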

    From these sources, the modeling team extended the conceptual model by defining concepts and properties to cover the temporal and multi-modal information of FuS-KG.

    Note: The links to download the raw versions of each source can be found at this LINK.

  3. Conceptualization

    FuS-KG’s conceptualization involved two steps: first, gathering and refining terminology from the HeLiS ontology and unstructured sources; second, refining the conceptual model and selecting ontology design patterns (ODPs). Adopted patterns from the ODP catalog include logical ones like Tree and N-Ary Relation, alignment via Class Equivalence, and content patterns such as Parameter, Time Interval, Action, and Classification.

  4. Integration

    FuS-KG aligns its core concepts with the DOLCE foundational ontology and integrates external vocabularies to enhance interoperability. These include the Time Ontology (ProperInterval), DCAT (Resource), and AGROVOC for nutritional concepts, enabling linkage with the Linked Open Data (LOD) cloud. To support ontology integration, we also looked for possible matches between the FuS-KG modules and the ontologies available in the identified FSI domains. The table below summarizes the number of matches found per module and target ontology (see the Matching folder for details).

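    | FuS-KG Module | Target Ontology | #Matches |
    |---|---|---|
    | Food | FoodOn | 131 |
    | Food | OntoFood | 55 |
    | Recipe | RecipeKG | 5 |
    | Activities | OPTImAL | 2 |
    | Activities | PACO | 6 |
    | Disease | HPO | 56 |
    | User | SOHO | 0 |
    | User | FOAF | 1 |
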
  5. Implementation

    FuS-KG is fully represented in Turtle (TTL) format. To manage its size and promote knowledge reuse, we applied the MOMo methodology. MOMo offers flexible guidelines through a sequence of steps to define modular ontologies, allowing engineers to create domain-specific modules as needed.

  6. Evaluation

    FuS-KG was thoroughly analyzed using the OOPS! pitfall scanner to identify potential issues. Most pitfalls detected related to reused ontologies (e.g., HeLiS) and included unconnected elements, missing annotations, and undeclared inverse relationships. All issues in newly developed modules were resolved. Consistency checks with Pellet and HermiT reasoners found no errors. Each module was individually evaluated to ensure quality, and the full FuS-KG passed successfully, confirming it is consistent, error-free, and meets all requirements.

  7. Documentation

    To facilitate community access and reuse, we provide comprehensive HTML-DOCS of the FuS-KG ontology, generated using the Widoco documentation tool.

Inside FuS-KG

FuSKG Modules

The knowledge about different domains of FuS-KG is divided into distinct “knowledge modules”. Each module encapsulates information related to a specific domain, such as user, activities, and food. This modular approach facilitates easier management and scalability of the knowledge base. Furthermore, by adopting a domain-specific split, we ensure that each module can be developed and updated independently, promoting modularity and reusability. To achieve this, the modularization followed the principles of the MOMo methodology.

Note

For more information, please refer to the paper, the README.md in the diagrams folders, and the available HTML-DOCS

MMKG Construction & Materialization

Unstructured sources preprocessing

We cleaned recipe preparation steps from unstructured multi-modal sources by removing steps shorter than eight characters (e.g., “Easy”, “YUM!”), reducing steps from 441,911 to 429,811. After filtering duplicates across datasets (e.g., oven preheating instructions), the total steps further decreased to 364,735. Multi-modal data (images from Recipe1M+, videos from Tasty) were centralized in a shared drive for easy access. RecipeDB was also manually integrated to enhance coverage and diversity of the overall knowledge base.
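
The two filters described above can be sketched as follows. This is a minimal illustration, not the repository's preprocessing code: the eight-character threshold comes from the text, while the function name and the duplicate test (case-insensitive, whitespace-normalized comparison) are assumptions.

```python
MIN_STEP_CHARS = 8  # steps shorter than this are dropped, as described above


def clean_steps(steps):
    """Drop very short steps, then drop duplicates while keeping first occurrences.

    Assumption: two steps count as duplicates when they are equal after
    lowercasing and collapsing whitespace.
    """
    seen = set()
    cleaned = []
    for step in steps:
        text = step.strip()
        if len(text) < MIN_STEP_CHARS:
            continue  # e.g. "Easy", "YUM!"
        key = " ".join(text.lower().split())
        if key in seen:
            continue  # e.g. a repeated oven preheating instruction
        seen.add(key)
        cleaned.append(text)
    return cleaned


raw_steps = [
    "Easy",
    "YUM!",
    "Preheat the oven to 350 degrees.",
    "preheat the oven to  350 degrees.",
    "Whisk the eggs with the sugar.",
]
cleaned_steps = clean_steps(raw_steps)
print(cleaned_steps)
```

Keeping the first occurrence preserves the original step order, which matters when the steps are later linked to recipes in sequence.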

Structured sources preprocessing

The USDA database, originally in Excel, was manually reformatted for compatibility with Protégé. A similar process was applied to physical activity data from the Compendium of Physical Activities, initially in PDF format.

MMKG materialization

Using Python scripts (CODE) built on the rdflib library, we parsed TBox concepts and properties and linked entities from all sources (ingredients, nutrition, preparation steps, images, videos). The resulting ABox TTL files and generation scripts are available in the repository. Manual inspections ensured data integrity before publishing. The table below shows the KG metrics by module.

MODULE #Axioms #Classes #Object Properties #Data Properties #Individuals
CORE804400
FOOD1,573,17925116140,928
RECIPES110,644142155,257
DISEASE27,5193216,095
ACTIVITY10,15524221,229
BARRIER2,853401511139
MULTI-MODAL4,123,5981025768,251
TEMPORAL4,056,60710451,524,016
GUIDELINES862210
USER3091411200

MMKG access

All FuS-KG materials, including FusKG-Tbox, FusKG-Abox, and documentation, are hosted on GitHub. Additionally, a Google Colab notebook (see below) enables interactive SPARQL queries based on the CQs associated with the REQs.

🚀 Google Colab Notebook

We have implemented a Python Colab Notebook showing how to interact with FuS-KG modules by performing SPARQL queries based on the CQs of each REQ (FUSKG-REQs&CQs).

Click below to access the interactive notebook:

Open in Colab

⚠️ Important

To load and run the notebook correctly, please follow the instructions in the README.md.


Note

  • For more detailed information, please refer to the paper.
  • Two Turtle files are available here (they are also used in the notebook): one shows how to model a user’s health data within FuS-KG, and the other shows the sequence of steps required to complete an activity outside the cooking domain, with a corresponding resource available online.

Authors

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
