SemTech

Aims and Scope

The exponential increase in scientific, technical, and legal data available on the Web, including research articles, patents, standards, and technical reports, has made their large-scale semantic processing, interlinking, and knowledge extraction a central challenge for the Web community. These data sources are heterogeneous, semi-structured, and domain-specific, containing complex elements such as text, tables, equations, and diagrams that make traditional data integration and analysis difficult. Yet, they hold immense potential for advancing knowledge discovery, open science, and evidence-based innovation. As the Web evolves into a vast ecosystem of human- and machine-generated content, there is a growing need to develop scalable AI models and semantically interoperable representations that transform this fragmented information into interconnected, machine-interpretable knowledge.

In this context, the SemTech 2026 workshop focuses on methods that combine Semantic Web technologies, Natural Language Processing, Large Language Models (LLMs), and other AI technologies to model knowledge across scientific, technical, and legal domains. The workshop invites research on knowledge graph creation, semantic annotation, LLM–KG hybrid reasoning, and trustworthy AI pipelines that enhance the reliability, interpretability, and reuse of Web data. This is particularly timely as the Web community seeks robust approaches to integrate symbolic and sub-symbolic methods for managing and understanding the growing body of domain-specific knowledge on the Web.

Workshop Topics

The workshop accepts contributions on all topics related to Semantic Web technologies and deep learning, including (but not limited to):

Data Collection

Leveraging LLMs for generating scientific, technical, and legal data.
New tools and systems for capturing scientific, technical, and legal data such as scientific articles, patent publications, etc.
Procedures and tools for storing, sharing, and preserving data on the Web.
Collecting and sharing data sets such as benchmarks, etc.
Pipelines and protocols to capture peculiarities from Web data.
Employing Semantic Web technologies to represent and preserve sensitive data on the Web with respect to ethics, privacy, security, and trust.

Novel AI technologies for scientific, technical, and legal data

Ontologies and annotation schema to model such data.
Annotation, linking and disambiguation of the data.
Knowledge graph construction.
LLMs to generate metadata, vocabularies, ontologies, and semantic models for specific data.

Applications of semantic technologies to patents and scientific, technical, and legal data

Exploiting knowledge graphs to drive document similarity, question answering, search, etc.
Semantic content-based retrieval.
Natural language processing techniques for classification, summarization, etc.
Exploratory search using semantic technologies on scientific, technical, and legal data.
Key enabling tools (also based on LLMs) for accessing and using data on the Web.
Applications based on Generative AI and LLMs.
Lessons learned and/or use cases from both academia and industry around semantic models and LLMs for data in specific domains.

Submission

The publication arrangements have changed due to conference policy; see the note on proceedings publication below.

Formatting Requirements. Submissions must be written in English, in double-column format, and must adhere to the ACM template and format (also available in Overleaf). Word users may use the Word Interim Template. The recommended setting for LaTeX is: \documentclass[sigconf, review]{acmart}. Papers must be submitted as PDF files via EasyChair.
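As an illustration of the recommended LaTeX setting, a minimal paper skeleton might look like the sketch below. Only the \documentclass line comes from the requirements above; the title, author details, keywords, and the references.bib file name are placeholders assumed for the example, and the official ACM template remains the authoritative starting point.

```latex
% Minimal sketch of an acmart submission using the recommended class options.
% All names and strings below are placeholders, not workshop requirements.
\documentclass[sigconf, review]{acmart}

\begin{document}

\title{Placeholder Title of the Submission}

\author{First Author}
\affiliation{%
  \institution{Placeholder Institution}
  \country{Placeholder Country}}
\email{first.author@example.org}

% In acmart the abstract must appear before \maketitle.
\begin{abstract}
A short summary of the contribution goes here.
\end{abstract}

\keywords{knowledge graphs, semantic annotation}

\maketitle

\section{Introduction}
Body text goes here.

% Uncomment once a bibliography file (here assumed to be references.bib) exists.
% \bibliographystyle{ACM-Reference-Format}
% \bibliography{references}

\end{document}
```

The review option adds line numbers for reviewers. Because the review process is single-blind, the sketch keeps author information visible.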

We will consider three different submission types:

Full Research Papers (6-8 pages maximum) should be clearly placed with respect to the state of the art and state the contribution of the proposal in the domain of application, even if presenting preliminary results. In particular, research papers should describe the methodology in detail, experiments should be repeatable, and a comparison with the existing approaches in the literature is encouraged.
Replicability/Reproducibility Papers (4 pages) should involve repeating prior experiments using the original source code and datasets to analyze existing methods and their limitations. Alternatively, authors may assess the robustness of previous work by applying the original code in new contexts, such as different domains or datasets.
Short Papers (4 pages) should describe significant novel work in progress. Compared to full papers, their contribution may be narrower in scope, be applied to a narrower set of application domains, or have weaker empirical support than that expected for a full paper. Submissions likely to generate discussions in new and emerging areas of legal data are encouraged.

Submissions should not exceed the indicated number of pages, including any diagrams and references.

The accepted papers will be available on the Workshop website. The proceedings shall be published in a CEUR-WS.org volume, which is free of charge for ALL authors. This publication is a "Diamond Open Access" service, meaning it is also free for all readers. The proceedings will be indexed on Google Scholar, DBLP, and Scopus as for the previous workshop editions.

CHANGE TO PROCEEDINGS PUBLICATION: Due to conference policy, papers accepted by the workshop will be included in the Companion Proceedings of the Web Conference 2026, which are archived in the ACM Digital Library, subject to meeting the ACM open-access and formatting guidelines and the camera-ready timeline as provided and observed by the ACM Web Conference. See the section "Important update on ACM's new open access publishing model for 2026 ACM Conferences!" on the conference website.

Each submission will be reviewed by three independent reviewers on the basis of relevance for the workshop, novelty/originality, significance, technical quality and correctness, quality and clarity of presentation, quality of references and reproducibility. The review process will be single-blind.

Program

The SemTech4STLD workshop will take place on June 29th, 2026 (online from Dubai).

Timing	Content
14:00 - 14:45 UTC+04:00	Keynote and Q&A on Knowledge Graphs: From Search to Industrial Intelligence
	Speaker: Dr. Evgeny Kharlamov
	Abstract: Graphs have powered every leap in machine intelligence, from PageRank to Google's Knowledge Graph to today's agentic search. This keynote traces that arc into the enterprise, showing how knowledge graphs integrate industrial data (BASF, Chanel, Airbus), why vector-only RAG falls short for closed domains, and how Graph RAG grounds LLMs in verifiable, connected facts. It closes by linking knowledge graphs to agentic AI, where models become active knowledge agents.
	Short Bio: Dr. Evgeny Kharlamov is a Senior Research Manager and Expert at the Bosch Center for AI and an Associate Professor at the University of Oslo, with earlier years at the University of Oxford. Ranked 3rd worldwide in Knowledge Engineering (AMiner), his work spans Knowledge Graphs, Neuro-Symbolic AI, and Agentic Systems, bringing semantic technologies from research into production at industrial scale. He has authored more than 200 papers at venues including WWW, ISWC, and NeurIPS (h-index 43, 6,700+ citations) and is a Principal Investigator on several European research projects. His current work grounds large-scale, agentic AI in knowledge-graph reasoning.
14:45 - 15:00 UTC+04:00	Virtual Coffee Break
15:00 - 17:00 UTC+04:00	Session I
	Paper I: The LOPE Method: Improving Consistent Property Extraction for Scientific Knowledge Graphs Using LLMs, Sandra Schaftner and Martin Gaedke (15 min + 5 Q&A) - Slides
	Paper II: Saliency-Guided Embedding Alignment for Query-to-Document Legal Case Retrieval, Yu-Han Shi and Yao-Chung Fan (15 min + 5 Q&A) - Slides
	Paper III: Beyond the Rules: Understanding the Design Logic of Internet Standards, Jie Bian, Michael Welzl and Nikolay Arefev (15 min + 5 Q&A) - Slides
	Virtual Coffee Break (15 minutes)
	Paper IV: The Atomic Instruction Gap: Instruction-Tuned LLMs Struggle with Simple, Self-Contained Directives, Henry Lim and Kwan Hui Lim (15 min + 5 Q&A) - Slides
	Paper V: ORKG Properties Ontology Consolidated: LLM-Driven Refinement of Crowdsourced Knowledge for Machine-Actionability, Sandra Schaftner and Martin Gaedke (15 min + 5 Q&A) - Slides
	Paper VI: Reasoning-Search-Augmented Large Language Models: A Survey and Taxonomy, Biswas Poudel, Nilson Chapagain, Amit Kumar and Xianshun Jiang (12 min + 5 Q&A)
	Paper VII: A Position Paper on Domain-Adaptive Text Classification and Abstractive Summarization using Semantic Enrichment and Transformer Models, Emmanuel Iko-Ojo Simon, Blessing Emedolu and Joshua Angyu (8 min + 5 Q&A)
	Closing remarks

Committees

Workshop Chairs

Rima Dessi' - Higher College of Technologies (United Arab Emirates)
Jeenu Joy - FIZ-Karlsruhe (Germany)
Danilo Dessi' - University of Sharjah (United Arab Emirates)
Francesco Osborne - Knowledge Media Institute - The Open University (United Kingdom)
Hidir Aras - FIZ-Karlsruhe (Germany)

Program Committee

Ahmad Alrifai - FIZ Karlsruhe, KIT-AIFB (Germany)
Rubén Alonso - R2M Solution Srl (Italy)
Simone Angioni - ISTI CNR (Italy)
Nana Yaw Asabere - Accra Technical University (Ghana)
Miriam Baglioni - ISTI - CNR (Italy)
Davide Buscaldi - LIPN, Université Paris 13, Sorbonne Paris Cité (France)
Leyla Jael Castro - ZB MED Information Centre for Life Sciences (Germany)
Serafeim Chatzopoulos - Athena Research Center (Greece)
Mathieu D'Aquin - LORIA, University of Lorraine (France)
Lu Gan - Heinrich-Heine University Düsseldorf; GESIS (Germany)
Susmita Gangopadhyay - GESIS - Leibniz Institute for the Social Sciences (Germany)
Alireza Javadian Sabet - University of Pittsburgh (USA)
Andrea Mannocci - CNR-ISTI (Italy)
Philipp Mayr - GESIS (Germany)
Giacomo Medda - University of Cagliari (Italy)
Allard Oelen - TIB - Leibniz Information Centre for Science and Technology (Germany)
Sarah Rajtmajer - The Pennsylvania State University (USA)
Sabine Wehnert - Ruhr University Bochum (Germany)

SemTech 2026
4th International Workshop on AI and Semantic Technologies
for the Scientific, Technical, and Legal Web

held at The Web Conference 2026

Dubai


Abstract deadline

January 5th, 2026

Paper deadline

January 12th, 2026

Notifications

January 25th, 2026

Camera-ready Paper

February 2nd, 2026