The rapid growth of scientific, technical, and legal data available online, such as patents, reports, and articles, has made large-scale analysis and processing of such data a crucial task. Today, scientists, patent experts, inventors, and other information professionals (e.g., information scientists and lawyers) contribute to this data every day by publishing articles and writing technical reports or patent applications.
Processing, analyzing, and exploring these documents is challenging because of their length, their domain-specific vocabulary, and the variety of scientific fields and domains they cover. The documents are semi-structured: they combine unstructured text with structured parts such as tables, mathematical formulas, and diagrams, as well as domain-specific information such as chemical names and bio-sequences.
This kind of information makes such documents complex to process; however, data is the lifeblood of many applications, and its preservation, analysis, enrichment, and use are key in several domains. To benefit from the scientific-technical knowledge present in such documents, e.g., for decision-making or for professional search and analytics, there is an urgent need to analyze, enrich, and link such data using state-of-the-art Semantic Web technologies and AI methods.
However, because these documents are heterogeneous and written using domain-specific terminology, applying existing semantic technologies is not straightforward. To address the challenges mentioned above, Semantic Web technologies, Natural Language Processing (NLP) techniques, Deep Neural Networks (DNNs), and Large Language Models (LLMs) must be leveraged to provide efficient and effective solutions for creating easily accessible and machine-understandable knowledge.
Contact us if you did not make it on time!
The workshop accepts contributions on all topics related to Semantic Web technologies and deep learning, including but not limited to:
Submissions must be in English and adhere to the CEUR-WS one-column template (see Session 2: The New CEURART Style). Papers should be submitted as PDF files to EasyChair. The review process will be single-blind. Please be aware that at least one author per paper must register and attend the workshop to present the work, and that ESWC is a 100% in-person conference.
We will consider three different submission types:
Submissions should not exceed the indicated number of pages, including any diagrams and references.
Each submission will be reviewed by three independent reviewers on the basis of relevance for the workshop, novelty/originality, significance, technical quality and correctness, quality and clarity of presentation, quality of references and reproducibility.
The accepted papers will be available on the Workshop website. The proceedings will be published in a CEUR-WS volume and consequently indexed on Google Scholar, DBLP, and Scopus.
All the information to register and attend the workshop can be found on the ESWC registration page.
The SemTech4STLD workshop will take place on May 26th, 2024.
Timing | Content |
---|---|
14:00-14:05 | Opening & Welcome |
14:05-14:50 |
Keynote and Q&A on Understanding Scientific and Societal Adoption of Scientific Knowledge and Resources Through NLP and Knowledge Graphs
Speaker: Prof. Dr. Stefan Dietze, GESIS – Leibniz Institute for the Social Sciences & Heinrich-Heine-University Düsseldorf
Abstract: Scientific discourse is scattered across unstructured scholarly publications and increasingly takes place online, e.g. in news or social media. Understanding the state-of-the-art in specific research fields, involved data, software, or methods, and their impact on both science and society requires substantial efforts and has become increasingly challenging. At the same time, societal debates about topics such as COVID or climate change have demonstrated the impact of science discourse on public opinion, policies, and society as a whole. This talk will provide an overview of a range of works that use deep learning-based NLP, such as PLMs and LLMs, to construct and use knowledge graphs about scientific discourse. These include, on the one hand, approaches that extract metadata about scholarly entities, such as code, data, tasks or machine learning models from scientific publications to enable machine-interpretable research information and understand dependencies between scholarly artefacts. On the other hand, we introduce NLP methods and knowledge graphs that enable an understanding of societal discourse about science, e.g. on Twitter/X, and facilitate interdisciplinary research into (mis-)representation and -information of scientific claims and findings in societal debates.
Short Bio: Stefan Dietze is Professor for Data & Knowledge Engineering at Heinrich-Heine-University Düsseldorf (HHU), and scientific director of the Department of Knowledge Technologies for the Social Science (KTS) at GESIS - Leibniz Institute for the Social Sciences. He also is deputy director at the Heine Center for Artificial Intelligence & Data Science (HeiCAD), and an affiliated member at the Düsseldorf Institute for Internet & Democracy (DIID) and the L3S Research Center of the Leibniz University Hanover, Germany.
His research interests are at the intersection of information retrieval, knowledge graphs, and NLP and his work is concerned with the extraction, fusion and search of knowledge and data, in particular, on the Web. His work has been published in top-tier conferences such as CIKM, EMNLP, ISWC, SIGIR, NAACL, or WebConf, where he also frequently serves as PC and/or organization committee member. |
14:50-15:30 Paper Session I |
Paper I: GerPS-NER: A Dataset for Named Entity Recognition to Support Public Service Process Creation in Germany. Leila Feddoul, Sarah T. Bachinger, Clara Lachenmaier, Sebastian Apel, Pirmin Karg, Norman Klewer, Denys Forshayt, Robin Erd and Marianne Mauch (12 min + 3 min Q&A)
Paper II: Automating Citation Placement with Natural Language Processing and Transformers. Davide Buscaldi, Danilo Dessì, Enrico Motta, Marco Murgia, Francesco Osborne and Diego Reforgiato (10 min + 3 min Q&A)
Paper III: Combining Knowledge Graphs and Large Language Models to Ease Knowledge Access in Software Architecture Research. Angelika Kaplan, Jan Keim, Marco Schneider, Anne Koziolek and Ralf Reussner (10 min + 3 min Q&A)
|
15:30-16:00 | Coffee Break |
16:00-16:35 |
Invited Talk and Q&A on Semantic Web and Machine Learning Systems for Intelligent Systems in Complex Domains
Speaker: Prof. Dr. Marta Sabou, Vienna University of Economics and Business (WU)
Abstract: Creating intelligent applications that valorise complex domain data such as in the scientific, technical, and legal domain often calls for solutions that combine learning and symbolic artificial intelligence (AI) methods. In line with such developments, in the first part of this talk, we focus on describing a new sub-area of AI that focuses on combining Machine Learning components with techniques developed by the Semantic Web community: Semantic Web Machine Learning (SWeML). We report on the results of a systematic mapping study during which we analysed nearly 500 papers published in the past decade in this area, where we focused on evaluating architectural and application-specific features of such systems. In the second part of the talk, we describe the development and evaluation of a concrete SWeML system that aims to extract key elements from official Austrian permits, including the Issuing Authority, the Operator of the facility in question, the Reference Number, and the Issuing Date. We hope that our lessons learned both about this area as a whole (through the survey of SWeML systems) and the concrete system we built will provide inspiration for researchers and practitioners working with such complex data as in the legal domain and beyond.
Short Bio: Prof. Dr. Marta Sabou is a professor for Information Systems and Business Engineering at the Vienna University of Economics and Business (WU) and the Head of Institute for Data, Process and Knowledge Management (DPKM). She holds a PhD in Artificial Intelligence from Vrije Universiteit Amsterdam, for which she won the IEEE Intelligent System’s Ten to Watch Award in 2006. During her career, she performed Artificial Intelligence (AI) research as Research Fellow at the Open University UK, Assistant Professor at MODUL University Vienna, Key Expert in Semantic Technologies at Siemens and FWF Elise-Richter Fellow at the Vienna University of Technology.
Prof. Sabou leads the Semantic Systems research group, which performs foundational and applied research at the intersection of the Semantic Web, Machine Learning and Human Computation research areas. Her group’s research topics range from knowledge engineering (knowledge graphs and their evaluation, data integration) to the development of novel intelligent systems that combine both symbolic and sub-symbolic AI techniques, i.e., neuro-symbolic systems. This foundational research underpins an active involvement in applied research in terms of developing advanced functionalities (e.g., system explainability and auditability) in application areas ranging from tourism and cultural heritage to mission critical domains enabled by complex cyber-physical (social) systems such as smart grids, smart buildings, smart factories (as part of Industry 4.0-5.0). Increasingly, the group addresses topics in the area of Digital Humanism such as the auditing of AI systems and the involvement of human stakeholders in the design of intelligent systems. Prof. Sabou is an accomplished academic (close to 150 peer-reviewed papers, h-index 46) and takes an active role in the Semantic Web research community as an editorial board member for two journals (SWJ, NAI) and conference organiser. |
16:35-17:50 Paper Session II |
Paper I: Extracting licence information from web resources with a Large Language Model. Enrico Daga, Jason Carvalho and Alba Catalina Morales Tirado (12 min + 3 min Q&A)
Paper II: ChatGPT vs. Google Gemini: Assessing AI Frontiers for Patent Prior Art Search Using European Search Reports. Renukswamy Chikkamath, Ankit Sharma, Christoph Hewel and Markus Endres (12 min + 3 min Q&A)
Paper III: Bridging the Innovation Gap: Leveraging Patent Information for Scientists by Constructing a Patent-centric Knowledge Graph. Hidir Aras, Rima Dessi, Farag Saad and Lei Zhang (10 min + 3 min Q&A)
Paper IV: Investigating Environmental, Social, and Governance (ESG) Discussions in News: A Knowledge Graph Analysis Empowered by AI. Simone Angioni, Sergio Consoli, Danilo Dessì, Francesco Osborne, Diego Reforgiato and Angelo Salatino (12 min + 3 min Q&A)
Paper V: PRICER: Leveraging Few-Shot Learning with Fine-Tuned Large Language Models for Unstructured Economic Data. Matt White, Declan O'Sullivan and Pj Wall (12 min + 3 min Q&A)
|
17:45-18:00 | Closing --- Presentations |