Search results

1 – 10 of over 287,000
Article
Publication date: 21 December 2021

Laouni Djafri

This work can be used as a building block in other settings such as GPU, Map-Reduce, Spark or any other. Also, DDPML can be deployed on other distributed systems such as P2P…


Abstract

Purpose

This work can be used as a building block in other settings such as GPU, Map-Reduce, Spark or any other framework. Also, DDPML can be deployed on other distributed systems such as P2P networks, clusters, cloud computing or other technologies.

Design/methodology/approach

In the age of Big Data, all companies want to benefit from large amounts of data. These data can help them understand their internal and external environment and anticipate associated phenomena, as the data turn into knowledge that can be used for prediction later. Thus, this knowledge becomes a great asset in companies' hands. This is precisely the objective of data mining. But with data and knowledge being produced in large amounts and at a faster pace, the authors now speak of Big Data mining. For this reason, the authors' proposed work mainly aims at solving the problems of volume, veracity, validity and velocity when classifying Big Data using distributed and parallel processing techniques. The problem raised in this work is therefore how to make machine learning algorithms work in a distributed and parallel way at the same time without losing the accuracy of the classification results. To solve this problem, the authors propose a system called Dynamic Distributed and Parallel Machine Learning (DDPML) algorithms. To build it, the authors divided their work into two parts. In the first, the authors propose a distributed architecture that is controlled by a Map-Reduce algorithm, which in turn depends on a random sampling technique. The distributed architecture the authors designed is thus specially directed at big data processing and operates in a coherent and efficient manner with the sampling strategy proposed in this work. This architecture also helps the authors to actually verify the classification results obtained using the representative learning base (RLB). In the second part, the authors extract the representative learning base by sampling at two levels using the stratified random sampling method. This sampling method is also applied to extract the shared learning base (SLB) and the partial learning bases for the first and second levels (PLBL1 and PLBL2). The experimental results show the efficiency of the proposed solution without significant loss in classification results. Thus, in practical terms, the DDPML system is generally dedicated to big data mining processing and works effectively in distributed systems with a simple structure, such as client-server networks.
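The two-level stratified sampling step described above can be illustrated with a short sketch. The following Python fragment is only a minimal, assumed illustration of class-stratified sampling of the kind DDPML applies to build the partial and representative learning bases; the DataFrame, column names and sampling fractions are hypothetical and do not reproduce the authors' implementation.

```python
# Illustrative sketch only: class-stratified random sampling similar in spirit
# to the two-level sampling used by DDPML. Column names and fractions are assumed.
import pandas as pd

def stratified_sample(df: pd.DataFrame, label_col: str, frac: float,
                      seed: int = 42) -> pd.DataFrame:
    """Draw the same fraction from every class so the sample preserves
    the class proportions of the full partition."""
    return (df.groupby(label_col, group_keys=False)
              .apply(lambda g: g.sample(frac=frac, random_state=seed)))

# Level 1: each node draws a partial learning base (PLBL1-style) from its partition.
partition = pd.DataFrame({"feature": range(1000),
                          "label": [i % 3 for i in range(1000)]})
plbl1 = stratified_sample(partition, "label", frac=0.10)

# Level 2: the partial bases are pooled and sampled again to form an RLB-style base.
rlb = stratified_sample(plbl1, "label", frac=0.50)
print(len(plbl1), len(rlb), rlb["label"].value_counts(normalize=True))
```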

Findings

The authors obtained very satisfactory classification results.

Originality/value

The DDPML system is specially designed to handle big data mining classification smoothly.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288


Article
Publication date: 26 January 2021

Adli Hamdam, Ruzita Jusoh, Yazkhiruni Yahya, Azlina Abdul Jalil and Nor Hafizah Zainal Abidin

The role of big data and data analytics in the audit engagement process is evident. Notwithstanding, understanding how big data influences cognitive processes and, consequently…


Abstract

Purpose

The role of big data and data analytics in the audit engagement process is evident. Notwithstanding, understanding of how big data influences cognitive processes and, consequently, the auditors’ judgment decision-making process is limited. The purpose of this paper is to present a conceptual framework on the cognitive process that may influence auditors’ judgment decision-making in the big data environment. The proposed framework predicts the relationships among data visualization integration, data processing modes, task complexity and auditors’ judgment decision-making.

Design/methodology/approach

The methodology to accomplish the conceptual framework is based on a thorough literature review that consists of theoretical discussions and comparative studies of other authors’ works and thinking. It also involves summarizing and interpreting previous contributions subjectively and narratively and extending the work in some fashion. Based on this approach, this paper formulates four propositions about data visualization integration, data processing modes, task complexity and auditors’ judgment decision-making. The proposed framework was built from cognitive theory, addressing how auditors process data into useful information for judgment decision-making.

Findings

The proposed framework expects that the cognitive process of data visualization integration and intuitive data processing mode will improve auditors’ judgment decision-making. This paper also contends that task complexity may influence the cognitive process of data visualization integration and processing modes because of the voluminous nature of data and the complexity of business processes. Hence, it is also expected that the relationships between data visualization integration and audit judgment decision-making and between processing mode and audit judgment decision-making will be moderated by task complexity.

Research limitations/implications

There is a dearth of studies examining how big data and big data analytics affect auditors’ cognitive processes in making decisions. This paper will help researchers and auditors understand the behavioral consequences of data visualization integration and data processing mode on judgment decision-making, given a certain level of task complexity.

Originality/value

With the advent of big data and the evolution of innovative audit procedures, the constructed framework can be used as a theoretical foundation for future empirical studies concerning auditors’ judgment decision-making. It highlights the potential of big data to transform the nature and practice of accounting and auditing.

Details

Accounting Research Journal, vol. 35 no. 1
Type: Research Article
ISSN: 1030-9616


Article
Publication date: 29 October 2021

Yanchao Rao and Ken Huijin Guo

The US Securities and Exchange Commission (SEC) requires public companies to file structured data in eXtensible Business Reporting Language (XBRL). One of the key arguments behind…

Abstract

Purpose

The US Securities and Exchange Commission (SEC) requires public companies to file structured data in eXtensible Business Reporting Language (XBRL). One of the key arguments behind the XBRL mandate is that the technical standard can help improve processing efficiency for data aggregators. This paper aims to empirically test the data processing efficiency hypothesis.

Design/methodology/approach

To test the data processing efficiency hypothesis, the authors adopt a two-sample research design by using data from Compustat: a pooled sample (N = 61,898) and a quasi-experimental sample (N = 564). The authors measure data processing efficiency as the time lag between the dates of 10-K filings on the SEC’s EDGAR system and the dates of related data finalized in the Compustat database.
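As a rough illustration of this lag measure, the sketch below computes the number of days between an EDGAR 10-K filing date and the date the related record is finalized in Compustat. It is a hypothetical Python fragment: the column names, identifiers and sample dates are assumptions for illustration, not the authors' actual extraction code.

```python
# Hypothetical sketch of the processing-lag measure: days between the 10-K
# filing date on EDGAR and the date the related data are finalized in
# Compustat. Column names and values are assumed for illustration.
import pandas as pd

filings = pd.DataFrame({
    "gvkey": [1001, 1002],
    "edgar_filing_date": ["2019-02-15", "2019-03-01"],
    "compustat_final_date": ["2019-02-20", "2019-03-12"],
})

filings["edgar_filing_date"] = pd.to_datetime(filings["edgar_filing_date"])
filings["compustat_final_date"] = pd.to_datetime(filings["compustat_final_date"])

# Processing lag in days: the efficiency measure described in the abstract.
filings["lag_days"] = (filings["compustat_final_date"]
                       - filings["edgar_filing_date"]).dt.days
print(filings[["gvkey", "lag_days"]])
```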

Findings

The statistical results show that, after controlling for the potential effects of firm size, age, fiscal year and industry, XBRL has a non-significant impact on data processing efficiency. This suggests that the data processing efficiency benefit may have been overestimated.

Originality/value

This study provides some timely empirical evidence to the debate as to whether XBRL can improve data processing efficiency. The non-significant results suggest that it may be necessary to revisit the mandate of XBRL reporting in the USA and many other countries.

Details

International Journal of Accounting & Information Management, vol. 30 no. 1
Type: Research Article
ISSN: 1834-7649


Article
Publication date: 1 March 1974

H.H. Von Muldau

“One picture says more than 1000 words”. This saying offers us information about one of the most important features of human beings. Human beings mostly relate their actions to…

Abstract

“One picture says more than 1000 words”. This saying offers us information about one of the most important features of human beings. Human beings mostly relate their actions to their surroundings by optical means. No other information channel is as well developed as the optical channel, and only the optical channel is able to process very large quantities of data at one time, bearing in mind the large number of steps between the image received by the eyes and the brain’s understanding of the contents of the picture.

Details

Industrial Robot: An International Journal, vol. 1 no. 3
Type: Research Article
ISSN: 0143-991X

Article
Publication date: 14 June 2013

Bojan Božić and Werner Winiwarter

The purpose of this paper is to present a showcase of semantic time series processing which demonstrates how this technology can improve time series processing and community…

Abstract

Purpose

The purpose of this paper is to present a showcase of semantic time series processing which demonstrates how this technology can improve time series processing and community building by the use of a dedicated language.

Design/methodology/approach

The authors have developed a new semantic time series processing language and prepared showcases to demonstrate its functionality. The assumption is an environmental setting with data measurements from different sensors to be distributed to different groups of interest. The data are represented as time series for water and air quality, while the user groups are, among others, the environmental agency, companies from the industrial sector and legal authorities.

Findings

A language for time series processing and several tools to enrich the time series with metadata and for community building have been implemented in Python and Java. A GUI for demonstration purposes has also been developed in PyQt4. In addition, an ontology for validation has been designed and a knowledge base for data storage and inference was set up. Some important features are: dynamic integration of ontologies, time series annotation, and semantic filtering.
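TSSL itself is not reproduced in this abstract, but the general idea of enriching time series points with semantic metadata and filtering on it can be sketched in a few lines of Python. Everything below, from the point structure to the tag names, is a hypothetical illustration rather than the authors' implementation.

```python
# Hypothetical sketch of time series annotation and semantic filtering in the
# spirit of TSSL (not the authors' actual language or tools).
from dataclasses import dataclass, field

@dataclass
class AnnotatedPoint:
    timestamp: str
    value: float
    tags: dict = field(default_factory=dict)  # semantic metadata, e.g. sensor, parameter

series = [
    AnnotatedPoint("2012-06-01T08:00", 6.9, {"sensor": "water", "param": "pH"}),
    AnnotatedPoint("2012-06-01T08:00", 41.0, {"sensor": "air", "param": "NO2"}),
    AnnotatedPoint("2012-06-01T09:00", 7.1, {"sensor": "water", "param": "pH"}),
]

def semantic_filter(points, **criteria):
    """Keep only points whose annotations match all given key/value pairs."""
    return [p for p in points
            if all(p.tags.get(k) == v for k, v in criteria.items())]

# e.g. an environmental agency interested only in water-quality readings
water_quality = semantic_filter(series, sensor="water")
print([(p.timestamp, p.value) for p in water_quality])
```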

Research limitations/implications

This paper focuses on the showcases of time series semantic language (TSSL), but also covers technical aspects and user interface issues. The authors are planning to develop TSSL further and evaluate it within further research projects and validation scenarios.

Practical implications

The research has a high practical impact on time series processing and provides new data sources for semantic web applications. It can also be used in social web platforms (especially for researchers) to provide a time series centric tagging and processing framework.

Originality/value

The paper presents an extended version of the paper presented at iiWAS2012.

Details

International Journal of Web Information Systems, vol. 9 no. 2
Type: Research Article
ISSN: 1744-0084


Article
Publication date: 19 February 2021

C. Lakshmi and K. Usha Rani

This paper presents a resilient distributed processing technique (RDPT), in which the mapper and reducer are simplified with Spark contexts to support distributed parallel query processing.

Abstract

Purpose

This paper proposes a resilient distributed processing technique (RDPT), in which the mapper and reducer are simplified with Spark contexts to support distributed parallel query processing.

Design/methodology/approach

The proposed work is implemented in Pig Latin with Spark contexts to develop query processing in a distributed environment.

Findings

Query processing in Hadoop relies on distributed processing with the MapReduce model. MapReduce distributes the work across different nodes through the implementation of complex mappers and reducers, and its results remain valid only up to a certain size of data.

Originality/value

Pig supports the required parallel processing framework with the following constructs during the processing of queries: FOREACH; FLATTEN; COGROUP.
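For readers unfamiliar with these Pig Latin constructs, a rough analogue can be sketched with the PySpark RDD API, where FOREACH ... GENERATE corresponds loosely to map, FLATTEN to flatMap and COGROUP to cogroup. The datasets and field names below are hypothetical, and the sketch does not reproduce the RDPT queries themselves.

```python
# Illustrative only: rough PySpark analogues of the Pig constructs named above
# (FOREACH ~ map, FLATTEN ~ flatMap, COGROUP ~ cogroup). Data are made up.
from pyspark import SparkContext

sc = SparkContext(appName="pig-constructs-sketch")

# Two small key/value datasets standing in for query inputs.
orders = sc.parallelize([("u1", 10), ("u2", 25), ("u1", 5)])
users = sc.parallelize([("u1", "alice"), ("u2", "bob")])

# FOREACH ... GENERATE: project/transform each tuple.
doubled = orders.map(lambda kv: (kv[0], kv[1] * 2))

# FLATTEN: expand a nested field into multiple tuples.
tags = sc.parallelize([("u1", ["new", "vip"]), ("u2", ["new"])])
flat_tags = tags.flatMap(lambda kv: [(kv[0], t) for t in kv[1]])

# COGROUP: group two datasets by key, side by side.
grouped = doubled.cogroup(users)
for key, (amounts, names) in grouped.collect():
    print(key, list(amounts), list(names))

sc.stop()
```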

Details

International Journal of Intelligent Computing and Cybernetics, vol. 14 no. 2
Type: Research Article
ISSN: 1756-378X


Article
Publication date: 2 October 2018

Dawn M. Russell and David Swanson

The purpose of this paper is to investigate the mediators that occupy the gap between information processing theory and supply chain agility. In today’s Mach speed business…


Abstract

Purpose

The purpose of this paper is to investigate the mediators that occupy the gap between information processing theory and supply chain agility. In today’s Mach speed business environment, managers often install new technology and expect an agile supply chain when they press <Enter>. This study reveals the naivety of such an approach, which has allowed new technology to be governed by old processes.

Design/methodology/approach

This work takes a qualitative approach to the dynamic conditions surrounding information processing and its connection to supply chain agility through the assessment of 60 exemplar cases. The situational conditions that have created the divide between information processing and supply chain agility are studied.

Findings

The agility adaptation typology (AAT), defining three types of adaptations and their mediating constructs, is presented. Type 1, information processing, is generally an exercise in synchronization that can be used to support assimilation. Type 2, demand sensing, is where companies are able to incorporate real-time data into everyday processes to better understand demand and move toward a real-time environment. Type 3, supply chain agility, requires fundamentally new thinking in the areas of transformation, mindset and culture.

Originality/value

This work describes the reality of today’s struggle to achieve supply chain agility, providing guidelines and testable propositions, and at the same time, avoids “ivory tower prescriptions,” which exclude the real world details from the research process (Meredith, 1993). By including the messy real world details, while difficult to understand and explain, the authors are able to make strides in the AAT toward theory that explains and guides the manager’s everyday reality with all of its messy real world details.

Details

The International Journal of Logistics Management, vol. 30 no. 1
Type: Research Article
ISSN: 0957-4093


Article
Publication date: 1 June 1976

B.M. Doouss and G.L. Collins

This monograph defines distributed intelligence and discusses the relationship of distributed intelligence to data base, justifications for using the technique, and the approach…


Abstract

This monograph defines distributed intelligence and discusses the relationship of distributed intelligence to data base, justifications for using the technique, and the approach to successful implementation of the technique. The approach is then illustrated by reference to a case study of experience in Birds Eye Foods. The planning process by which computing strategy for the company was decided is described, and the planning conclusions reached to date are given. The current state of development in the company is outlined and the very real savings so far achieved are specified. Finally, the main conclusions of the monograph are brought together. In essence, these conclusions are that major savings are achievable using distributed intelligence, and that the implementation of a company data processing plan can be made quicker and simpler by its use. However, careful central control must be maintained so as to avoid fragmentation of machines, language skills and applications.

Details

Management Decision, vol. 14 no. 6
Type: Research Article
ISSN: 0025-1747

Article
Publication date: 2 October 2019

Sabrina Lechler, Angelo Canzaniello, Bernhard Roßmann, Heiko A. von der Gracht and Evi Hartmann

Particularly in volatile, uncertain, complex and ambiguous (VUCA) business conditions, staff in supply chain management (SCM) look to real-time (RT) data processing to reduce…


Abstract

Purpose

Particularly in volatile, uncertain, complex and ambiguous (VUCA) business conditions, staff in supply chain management (SCM) look to real-time (RT) data processing to reduce uncertainties. However, based on the premise that data processing can be perfectly mastered, such expectations do not reflect reality. The purpose of this paper is to investigate whether RT data processing reduces SCM uncertainties under real-world conditions.

Design/methodology/approach

Aiming to facilitate communication on the research question, a Delphi expert survey was conducted to identify challenges of RT data processing in SCM operations and to assess whether it does influence the reduction of SCM uncertainty. In total, 14 prospective statements concerning RT data processing in SCM operations were developed and evaluated by 68 SCM and data-science experts.

Findings

RT data processing was found to have an ambivalent influence on the reduction of SCM complexity and associated uncertainty. Analysis of the data collected from the study participants revealed a new type of uncertainty related to SCM data itself.

Originality/value

This paper discusses the challenges of gathering relevant, timely and accurate data sets in VUCA environments and creates awareness of the relationship between data-related uncertainty and SCM uncertainty. Thus, it provides valuable insights for practitioners and the basis for further research on this subject.

Details

International Journal of Physical Distribution & Logistics Management, vol. 49 no. 10
Type: Research Article
ISSN: 0960-0035


Book part
Publication date: 7 May 2019

Francesco Ciclosi, Paolo Ceravolo, Ernesto Damiani and Donato De Ieso

This chapter analyzes the compliance of some categories of Open Data in Politics with EU General Data Protection Regulation (GDPR) requirements. After clarifying the legal basis of…

Abstract

This chapter analyzes the compliance of some categories of Open Data in Politics with EU General Data Protection Regulation (GDPR) requirements. It clarifies the legal basis of this framework, with specific attention to the processing procedures that conform to the legitimate interests pursued by the data controller, including open data licenses or anonymization techniques. These can result in only partial application of the GDPR, but there is no generic guarantee; as a consequence, an appropriate process of analysis and management of risks is required.

Details

Politics and Technology in the Post-Truth Era
Type: Book
ISBN: 978-1-78756-984-3

