Search results

1 – 10 of over 8000

Open Access

Article

Publication date: 2 April 2024

Automated Dewey Decimal Classification of Swedish library metadata using Annif software

Koraljka Golub, Osma Suominen, Ahmed Taiye Mohammed, Harriet Aagaard and Olof Osterman

Abstract

Purpose

In order to estimate the value of semi-automated subject indexing in operative library catalogues, the study aimed to investigate five different automated implementations of an open source software package on a large set of Swedish union catalogue metadata records, with Dewey Decimal Classification (DDC) as the target classification system. It also aimed to contribute to the body of research on aboutness and related challenges in automated subject indexing and evaluation.

Design/methodology/approach

On a sample of over 230,000 records with close to 12,000 distinct DDC classes, the open source tool Annif, developed by the National Library of Finland, was applied in the following implementations: lexical algorithm, support vector classifier, fastText, Omikuji Bonsai and an ensemble approach combining the former four. A qualitative study involving two senior catalogue librarians and three students of library and information studies was also conducted to investigate the value and inter-rater agreement of automatically assigned classes, on a sample of 60 records.
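
As a rough illustration of what combining four classifiers can mean, the sketch below averages per-class scores across backends and ranks the result. It is a minimal sketch under stated assumptions, not the method reported in the paper: the function name, the equal weights and the example scores are all hypothetical, and the abstract does not say how the ensemble was weighted.

```python
from collections import defaultdict


def ensemble_average(backend_scores, weights=None):
    """Combine per-class scores from several backends by weighted averaging.

    backend_scores maps a backend name to a dict of {class: score}.
    A class missing from a backend contributes a score of 0 for it.
    Equal weights are an assumption made for this illustration.
    """
    names = list(backend_scores)
    weights = weights or {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)

    combined = defaultdict(float)
    for name in names:
        for ddc_class, score in backend_scores[name].items():
            combined[ddc_class] += weights[name] * score / total_weight

    # Highest combined score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for a single record; not data from the study.
scores = {
    "lexical": {"839": 0.5, "025": 0.25},
    "svc": {"839": 0.75},
    "fasttext": {"025": 0.5, "839": 0.25},
    "omikuji": {"839": 0.5, "616": 0.25},
}
ranked = ensemble_average(scores)
print(ranked)
```

Averaging rewards classes that several backends agree on, which is one plausible reason an ensemble can outperform its individual members.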

Findings

The ensemble approach performed best, reaching 66.82% accuracy on the three-digit DDC classification task. The qualitative study confirmed earlier studies reporting low inter-rater agreement but also pointed to the potential value of automatically assigned classes as additional access points in information retrieval.
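
To make the three-digit accuracy figure concrete, the sketch below shows one way such a measure could be computed: each DDC notation is truncated to its first three digits and the truncated values are compared. This is an assumption about the evaluation, not the paper's documented procedure; the helper names and the sample notations are hypothetical.

```python
def ddc_three_digit(notation):
    """Reduce a DDC notation such as '839.738' to its three-digit class '839'."""
    main_part = notation.strip().split(".")[0]
    return main_part[:3].zfill(3)


def three_digit_accuracy(gold, predicted):
    """Share of records whose predicted three-digit class matches the gold one."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    hits = sum(
        ddc_three_digit(g) == ddc_three_digit(p)
        for g, p in zip(gold, predicted)
    )
    return hits / len(gold)


# Hypothetical notations for four records; not data from the study.
gold = ["839.738", "025.431", "616.89", "004.678"]
predicted = ["839.7", "025.04", "617.1", "004.6"]
accuracy = three_digit_accuracy(gold, predicted)
print(f"{accuracy:.2%}")
```

Truncating to three digits makes the task coarser than full-notation matching, so a prediction can count as correct even when its decimal extension differs from the gold class.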

Originality/value

The paper presents an extensive study of automated classification in an operative library catalogue, accompanied by a qualitative study of automated classes. It demonstrates the value of applying semi-automated indexing in operative information retrieval systems.

Details

Journal of Documentation, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 December 2000

The automation of controlled vocabulary subject indexing of medical journal articles

David Roberts and Clive Souter

Abstract

This article discusses the possibility of the automation of sophisticated subject indexing of medical journal articles. Approaches to subject descriptor assignment in information retrieval research are usually either based upon the manual descriptors in the database or generation of search parameters from the text of the article. The principles of the Medline indexing system are described, followed by a summary of a pilot project, based upon the Amed database. The results suggest that a more extended study, based upon Medline, should encompass various components: (1) extraction of ‘concept strings’ from titles and abstracts of records, based upon linguistic features characteristic of medical literature; (2) use of the Unified Medical Language System (UMLS) for identification of controlled vocabulary descriptors; and (3) coordination of descriptors, utilising features of the Medline indexing system. The emphasis should be on system manipulation of data, based upon input, available resources and specifically designed rules.

Details

Aslib Proceedings, vol. 52 no. 10

Type: Research Article

ISSN: 0001-253X

Article

Publication date: 13 October 2023

Identification of social scientifically relevant topics in an interview repository: a natural language processing experiment

Judit Gárdos, Julia Egyed-Gergely, Anna Horváth, Balázs Pataki, Roza Vajda and András Micsik

Abstract

Purpose

The present study is about generating metadata to enhance thematic transparency and facilitate research on interview collections at the Research Documentation Centre, Centre for Social Sciences (TK KDK) in Budapest. It explores the use of artificial intelligence (AI) in producing, managing and processing social science data and its potential to generate useful metadata to describe the contents of such archives on a large scale.

Design/methodology/approach

The authors combined manual and automated/semi-automated methods of metadata development and curation. The authors developed a suitable domain-oriented taxonomy to classify a large text corpus of semi-structured interviews. To this end, the authors adapted the European Language Social Science Thesaurus (ELSST) to produce a concise, hierarchical structure of topics relevant in social sciences. The authors identified and tested the most promising natural language processing (NLP) tools supporting the Hungarian language. The results of manual and machine coding will be presented in a user interface.

Findings

The study describes how an international social scientific taxonomy can be adapted to a specific local setting and tailored to be used by automated NLP tools. The authors show the potential and limitations of existing and new NLP methods for thematic assignment. The current possibilities of multi-label classification in social scientific metadata assignment are discussed, i.e. the problem of automated selection of relevant labels from a large pool.

Originality/value

Interview materials have not yet been used for building manually annotated training datasets for automated indexing of scientifically relevant topics in a data repository. Comparing various automated-indexing methods, this study shows a possible implementation of a researcher tool supporting custom visualizations and the faceted search of interview collections.

Details

Journal of Documentation, vol. 80 no. 2

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 2 September 2014

Enhancing social tagging with automated keywords from the Dewey Decimal Classification

Koraljka Golub, Marianne Lykke and Douglas Tudhope

Abstract

Purpose

The purpose of this paper is to explore the potential of applying the Dewey Decimal Classification (DDC) as an established knowledge organization system (KOS) for enhancing social tagging, with the ultimate purpose of improving subject indexing and information retrieval.

Design/methodology/approach

Over 11,000 Intute metadata records in politics were used. In total, 28 politics students were each given four tasks, in which a total of 60 resources were tagged in two different configurations, one with uncontrolled social tags only and another with uncontrolled social tags as well as suggestions from a controlled vocabulary. The controlled vocabulary was DDC, which also included mappings from the Library of Congress Subject Headings.

Findings

The results demonstrate the importance of controlled vocabulary suggestions for indexing and retrieval: to help produce ideas of which tags to use, to make it easier to find focus for the tagging, to ensure consistency and to increase the number of access points in retrieval. The value and usefulness of the suggestions proved to be dependent on the quality of the suggestions, both as to conceptual relevance to the user and as to appropriateness of the terminology.

Originality/value

No research has investigated the enhancement of social tagging with suggestions from the DDC, an established KOS, in a user trial, comparing social tagging only and social tagging enhanced with the suggestions. This paper is a final reflection on all aspects of the study.

Details

Journal of Documentation, vol. 70 no. 5

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 March 1988

Subject Access to Library Catalogues in Scottish University, Central Institution and College of Education Libraries: A Survey

John C. Crawford

Abstract

The findings of a survey of Scottish university, central institution and college of education libraries to assess present and planned subject access to their catalogues and whether online catalogues are likely to improve subject access are reported. The results are analysed and the findings discussed in relation to published studies of subject access in online catalogues. It is concluded that greater attention needs to be paid to subject access both by librarians in specifying automated systems and by system suppliers in responding to specifications.

Details

Library Review, vol. 37 no. 3

Type: Research Article

ISSN: 0024-2535

Article

Publication date: 1 September 1997

Network‐accessible resources and the redefinition of technical services

Neil Jones

Abstract

States that as use of networks becomes more innovative and widespread in higher education libraries, current approaches to the organization of network‐accessible resources reveal flaws. Moving forward from the recommendations of the Follett Report, and adopting an approach which seeks to redefine conceptually conventional practices and standards, the study examines, from a technical services perspective, issues and approaches relating to the development of existing cataloguing rules and practices, and machine‐readable standards, and proposes these standards as the most effective means of enhancing accessibility to electronic resources. Characterizes the current period as one of organizational, technological and conceptual transition, and addresses the broader issue of academic network‐accessibility in the local, regional, national and international context. Additionally, identifies the challenges to and implications for conventional, and future, technical services operations of these trends.

Details

New Library World, vol. 98 no. 5

Type: Research Article

ISSN: 0307-4803

Article

Publication date: 1 February 1974

Vine Volume 4 Issue 2 1974

Abstract

VINE is a Very Informal NEwsletter produced three or four times a year by the Information Officer for Library Automation and financed by the British Library Research and Development Department. It is issued free of charge on request to interested librarians, systems staff and library college lecturers. VINE's objective is to provide an up‐to‐date picture of work being done in U.K. library automation projects which has not been reported elsewhere.

Details

VINE, vol. 4 no. 2

Type: Research Article

ISSN: 0305-5728

Article

Publication date: 1 February 1988

INFO: a Cardbox‐plus index to sources of computer and telecommunications information

Julia M. Johnson

Abstract

This article describes the design, use and evolution over four years of INFO, a Cardbox‐plus based file containing bibliographic and other data relating to the computer and telecommunications industries. INFO was designed and implemented by the author, who provides an information service to the UK‐based consultancy PA Computers and Telecommunications Ltd. What began as a listing of articles in current periodicals is now an index to potentially useful sources of information. These sources may be:

Details

Program, vol. 22 no. 2

Type: Research Article

ISSN: 0033-0337

Article

Publication date: 1 May 2006

Automated subject classification of textual web documents

Koraljka Golub

Abstract

Purpose

To provide an integrated perspective on similarities and differences between approaches to automated classification in different research communities (machine learning, information retrieval and library science), and point to problems with the approaches and automated classification as such.

Design/methodology/approach

A range of works dealing with automated classification of full‐text web documents are discussed. Explorations of individual approaches are given in the following sections: special features (description, differences, evaluation), application and characteristics of web pages.

Findings

Provides major similarities and differences between the three approaches: document pre‐processing and utilization of web‐specific document characteristics are common to all the approaches; major differences are in applied algorithms, employment or not of the vector space model and of controlled vocabularies. Problems of automated classification are recognized.

Research limitations/implications

The paper does not attempt to provide an exhaustive bibliography of related resources.

Practical implications

As an integrated overview of approaches from different research communities with application examples, it is very useful for students in library and information science and computer science, as well as for practitioners. Researchers from one community have the information on how similar tasks are conducted in different communities.

Originality/value

To the author's knowledge, no review paper on automated text classification attempted to discuss more than one community's approach from an integrated perspective.

Details

Journal of Documentation, vol. 62 no. 3

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 1 February 1979

The role of automated subject switching in a distributed information network

R.T. Niehoff and S. Kwasny

Abstract

This article discusses some of the forces at work within today's online database environment which could lead to the emergence of a distributed information network. Several important modules in such a network are identified, including an automated subject switching module. Switching options include: exact matching, equivalency matching, and word‐ and phrase‐stem matching. Research investigations, critical issues, and preliminary findings with regard to switching options and strategies are reported. It is anticipated that one of the primary benefits from automated subject switching will be much greater utilization of the online STI resource.
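
The three switching options named in the abstract can be pictured as a fallback chain: try an exact match, then a table of equivalent terms, then a match on word stems. The sketch below is only an illustration under that assumption; the article's actual module is not described here, and the vocabulary, the equivalence table and the crude suffix-stripping stemmer are all hypothetical.

```python
def crude_stem(word):
    """Strip a few common English suffixes; a deliberately simple stand-in
    for a real stemmer."""
    for suffix in ("ing", "ers", "er", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_phrase(phrase):
    """Lower-case a phrase and stem each of its words."""
    return " ".join(crude_stem(word) for word in phrase.lower().split())


def switch_term(term, target_vocab, equivalents):
    """Map a source term to a target-vocabulary term.

    Returns (target_term, match_type), trying exact, equivalent and stem
    matching in that order, or (None, "none") if nothing matches.
    """
    key = term.lower()
    exact = {t.lower(): t for t in target_vocab}
    if key in exact:
        return exact[key], "exact"
    if key in equivalents:
        return equivalents[key], "equivalent"
    stems = {stem_phrase(t): t for t in target_vocab}
    stemmed = stem_phrase(term)
    if stemmed in stems:
        return stems[stemmed], "stem"
    return None, "none"


# Hypothetical target vocabulary and equivalence table.
target_vocab = ["Computers", "Information retrieval", "Library catalogs"]
equivalents = {"data processing machines": "Computers"}

for source_term in ["computers", "Data processing machines", "library catalog", "zoology"]:
    print(source_term, "->", switch_term(source_term, target_vocab, equivalents))
```

Ordering the options from strictest to loosest means a looser match is used only when a stricter one fails, which keeps the more reliable mappings from being overridden.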

Details

Online Review, vol. 3 no. 2

Type: Research Article

ISSN: 0309-314X
