DATA 2012 Abstracts


Area 1 - Data Warehousing and Business Intelligence

Full Papers
Paper Nr: 17
Title:

Cloud based Privacy Preserving Data Mining with Decision Tree

Authors:

Echo P. Zhang, Yi-Jun He and Lucas C. K. Hui

Abstract: Privacy Preserving Data Mining (PPDM) aims at performing data mining among multiple parties while ensuring that no single party risks releasing its private data to any other. Nowadays, cloud services are becoming more and more popular; however, how to deal with the privacy issues they raise is still an open question. This paper is one of the first studies of cloud-server-based PPDM. We propose a novel protocol in which the cloud server performs data mining on encrypted databases, and our solution guarantees the privacy of each client. The scheme protects clients from malicious users and, with the aid of a hardware box, also from an untrusted cloud server. Another novel feature of this solution is that it works even when the databases from different parties overlap.

Paper Nr: 23
Title:

Flexible Information Management, Exploration and Analysis in SAP HANA

Authors:

Christof Bornhoevd, Robert Kubis, Wolfgang Lehner, Hannes Voigt and Horst Werner

Abstract: Data management is no longer limited to towering data silos full of perfectly structured, well-integrated data. Today, we need to process and make sense of data from diverse sources (public and on-premise), in different application contexts, with different schemas, and with varying degrees of structure and quality. Because of the necessity to define a rigid data schema upfront, fixed-schema database systems are not a good fit for these new scenarios. However, schema is still essential to give data meaning and to process data purposefully. In this paper, we describe a schema-flexible database system that combines a flexible data model with a powerful data query, analysis, and manipulation language, providing both the required schema information and the flexibility needed for modern information processing and decision support.

Paper Nr: 34
Title:

Applying Personal and Group-based Trust Models in Document Recommendation

Authors:

Chin-Hui Lai, Duen-Ren Liu and Cai-Sin Lin

Abstract: Collaborative filtering (CF) recommender systems have been used in various application domains to solve the information-overload problem. Recently, trust-based recommender systems have incorporated the trustworthiness of users into CF techniques to improve the quality of recommendation. Some researchers have proposed rating-based trust models to derive trust values from users’ past ratings of items, or from explicitly specified relations (e.g. friends) or trust relationships. The rating-based trust model may not be effective in CF recommendations, due to unreliable trust values derived from very few past rating records. In this work, we propose a hybrid personal trust model which adaptively combines the rating-based trust model and an explicit trust metric to resolve the drawback caused by insufficient past rating records. Moreover, users with similar preferences usually form a group to share items (knowledge) with each other, and thus users’ preferences may be affected by group members. Accordingly, group trust can enhance personal trust to support recommendation from the group perspective. Finally, we propose a recommendation method based on a hybrid model of personal and group trust to improve recommendation performance. The experimental results show that the proposed models improve the prediction accuracy over other trust-based recommender systems.
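
The adaptive combination described above can be sketched as follows. This is an illustrative sketch only, not the authors' exact model: the blending weight and the agreement measure are assumptions chosen to show how little co-rating evidence shifts weight toward explicit trust.

```python
# Hypothetical sketch of a hybrid personal trust score: lean on explicit
# trust when few co-rated items exist, and on rating-derived trust as
# evidence accumulates. All names and formulas here are illustrative.

def rating_based_trust(ratings_u, ratings_v):
    """Trust derived from agreement on co-rated items (assumed measure)."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    agree = sum(1 for i in common if abs(ratings_u[i] - ratings_v[i]) <= 1)
    return agree / len(common)

def hybrid_trust(ratings_u, ratings_v, explicit_trust, k=5):
    """Blend: the weight on rating-based trust grows with co-rating evidence."""
    n = len(set(ratings_u) & set(ratings_v))
    alpha = n / (n + k)  # few co-ratings -> alpha near 0 -> rely on explicit trust
    return alpha * rating_based_trust(ratings_u, ratings_v) + (1 - alpha) * explicit_trust
```

With no shared rating history the score falls back entirely to the explicit trust value, which is the drawback-resolution behaviour the abstract describes.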

Paper Nr: 38
Title:

A Virtual Document Approach for Keyword Search in Databases

Authors:

Jaime I. Lopez-Veyna, Victor J. Sosa-Sosa and Ivan Lopez-Arevalo

Abstract: In recent years, the amount of information available in a variety of data sources, such as those found on the Web, has grown at an accelerated pace. This information can be classified by its structure into three forms: unstructured (free-text documents), semi-structured (XML documents) and structured (a relational or XML database). A search technique that has gained wide acceptance for massive data sources such as the Web is keyword-based search, which is simple for anyone familiar with Web search engines. Keyword search has become an alternative for users without any knowledge of formal query languages or of the schemas used in structured data. Traditional approaches to keyword search over relational databases include Steiner Trees, Candidate Networks and, more recently, Tuple Units; nevertheless, these methods have some limitations. In this paper we propose a Virtual Document (VD) approach for keyword search in databases. We represent the structured information as graphs and propose an index that captures the structural relationships of the information. This approach produces fast and accurate search results. We have conducted extensive experiments on large-scale real databases, and the results demonstrate that our approach achieves high search efficiency and high accuracy for keyword search in databases.
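
The core idea of a virtual document can be illustrated in a few lines. This is a minimal sketch under assumed inputs (the paper's actual index and graph representation are not reproduced): each tuple is expanded with the text of its structurally related tuples, and a plain inverted index then serves keyword queries over the expanded documents.

```python
# Illustrative sketch of the virtual-document idea; data shapes are assumed.
# tuples: {tuple_id: text}; links: {tuple_id: [related tuple ids]} capturing
# structural relationships (e.g. foreign keys) in the database graph.
from collections import defaultdict

def build_virtual_documents(tuples, links):
    """Expand each tuple with text from its linked neighbours."""
    vdocs = {}
    for tid, text in tuples.items():
        neighbour_text = " ".join(tuples[n] for n in links.get(tid, []))
        vdocs[tid] = (text + " " + neighbour_text).lower().split()
    return vdocs

def build_index(vdocs):
    """Plain inverted index: word -> set of virtual-document ids."""
    index = defaultdict(set)
    for tid, words in vdocs.items():
        for w in words:
            index[w].add(tid)
    return index

def search(vdocs, query):
    """Rank virtual documents by the number of matched query keywords."""
    q = query.lower().split()
    scores = {tid: sum(w in words for w in q) for tid, words in vdocs.items()}
    return sorted((t for t in scores if scores[t]), key=lambda t: -scores[t])
```

A tuple that mentions neither keyword can still rank first if its neighbours do, which is what lets keyword search span join paths without Steiner-tree enumeration at query time.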

Short Papers
Paper Nr: 25
Title:

Constrained Nonnegative Matrix Factorization based Data Distortion Techniques - Study of Data Privacy and Utility

Authors:

Nirmal Thapa, PengPeng Lin, Lian Liu, Jie Wang and Jun Zhang

Abstract: The rise of data mining techniques has brought with it the problem of privacy disclosure, which is why privacy has become one of the top priorities in the design of data mining techniques. In this paper, we briefly discuss Nonnegative Matrix Factorization (NMF) and the motivation for using NMF for data representation. We provide the mathematical derivation for NMF with some additional constraints. Based on these derivations, we propose two novel data distortion strategies: the first is Constrained Nonnegative Matrix Factorization (CNMF) and the second is Sparsified CNMF. We compare the distortion level of each of these algorithms with that of other matrix-based techniques such as SVD and NMF, and use K-means to study the data utility of the two proposed methods. Our experimental results show that, in comparison with standard data distortion techniques, the proposed schemes are very effective in achieving a good tradeoff between data privacy and data utility: they afford a feasible solution for protecting sensitive information while promising higher accuracy in decision making. We investigate the utility of the perturbed data against the results obtained from the original data.

Paper Nr: 27
Title:

Kernel Generations for a Diagnosis Model with GP

Authors:

Jongseong Kim and Hoo-Gon Choi

Abstract: An accurate diagnosis model is required to diagnose medical subjects with high accuracy and recall. Laboratory test data were collected from 953 subjects with latent type 2 diabetes mellitus. The results are classified into a patient group and a normal group using support vector machine kernels optimized through genetic programming. Genetic programming is applied to the input data twice with absorbing evolution, which is a new approach. The results show that the new approach creates a kernel with 80% accuracy, a 0.794 recall rate and a 28% reduction in computing time compared to other typical methods. Moreover, the suggested kernel can be easily used by practitioners with little or no experience with large data.

Paper Nr: 29
Title:

Automatic Subspace Clustering with Density Function

Authors:

Jiwu Zhao and Stefan Conrad

Abstract: Clustering techniques in data mining aim to find interesting patterns in data sets. However, traditional clustering methods are not suitable for large, high-dimensional data. Subspace clustering extends traditional clustering by finding clusters in subspaces of a data set, which makes it more suitable for detecting clusters in high-dimensional data. However, most subspace clustering methods require many complicated parameter settings, which are troublesome to determine and thus limit the applicability of these methods. In this article, we develop a novel subspace clustering method with a new density function, which computes and represents the density distribution directly in high-dimensional data sets; moreover, the new method requires as few parameters as possible.

Paper Nr: 33
Title:

Inference in Hierarchical Multidimensional Space

Authors:

Alexandr Savinov

Abstract: In spite of its fundamental importance, inference has not been an inherent function of multidimensional models and analytical applications. These models are mainly aimed at numeric analysis where the notion of inference is not well defined. In this paper we define inference using only multidimensional terms like axes and coordinates as opposed to using logic-based approaches. We propose an inference procedure which is based on a novel formal setting of nested partially ordered sets with operations of projection and de-projection.

Posters
Paper Nr: 45
Title:

Geometric Divide and Conquer Classification for High-dimensional Data

Authors:

Pei Ling Lai, Yang Jin Liang and Alfred Inselberg

Abstract: From the Nested Cavities (abbr. NC) classifier (Inselberg and Avidan, 2000) a powerful new classification approach emerged. For a dataset P and a subset S ⊂ P, the classifier constructs a rule distinguishing the elements of S from those in P\S. The NC is a geometrical algorithm which builds a sequence of nested unbounded parallelepipeds of minimal dimensionality containing disjoint subsets of P, from which a hypersurface (the rule) containing the subset S is obtained. The partitioning of P\S and S into disjoint subsets is very useful when the original rule obtained is either too complex or imprecise. As illustrated with examples, this separation reveals exquisite insight into the dataset's structure. Specifically, in one of the problems we studied, two different types of water mines were separated; in another dataset, two distinct types of ovarian cancer were found. The process is developed and illustrated on a (sonar) dataset with 60 variables and two categories ("mines" and "rocks"), resulting in significant understanding of the domain and simplification of the classification rule. Such a situation is generic and occurs with other datasets, as illustrated with a similar decomposition of a financial dataset producing two sets of conditions determining gold prices. The divide-and-conquer extension can be automated and also allows the classification of the sub-categories to be done in parallel.

Area 2 - Data Management and Quality

Full Papers
Paper Nr: 16
Title:

On Continuous Top-k Similarity Joins

Authors:

Da Jun Li, En Tzu Wang, Yu-Chou Tsai and Arbee L. P. Chen

Abstract: Given a similarity function and a threshold τ within the range [0, 1], a similarity join query between two sets of records returns the pairs of records from the two sets whose similarity values exceed or equal τ. Similarity joins have received much research attention since they are a fundamental operation in a wide range of applications such as duplicate detection, data integration, and pattern recognition. Recently, a variant of similarity joins, the top-k similarity join, has been proposed to avoid the need to set the threshold τ. Since data in many applications are generated as continuous data streams, in this paper we make the first attempt to solve the top-k similarity join problem in a dynamic environment involving a data stream, named continuous top-k similarity joins. Given a set of records as the query, we continuously output the top-k pairs of records, ranked by their similarity values, between the query and the most recent data, i.e. the data contained in the sliding window of a monitored data stream. Two algorithms are proposed to solve this problem. The first extends an existing approach for static datasets: it finds the top-k pairs between the query and the newly arrived data and keeps the obtained pairs in a candidate result set, from which the top-k pairs can then be found. In the other algorithm, the records in the query are preprocessed and indexed using a novel data structure. With this structure, the data in the monitored stream can be compared with all records in the query at one time, substantially reducing the processing time of finding the top-k results. A series of experiments are performed to evaluate the two proposed algorithms, and the results demonstrate that the algorithm with preprocessing outperforms the one extended from the static approach.
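
The problem setting can be made concrete with a small sketch. This is a naive baseline under assumed details (Jaccard similarity, set-valued records), not the paper's indexed algorithm: it simply recomputes all query-window pairs as the sliding window advances.

```python
# Illustrative sketch of continuous top-k similarity joins over a sliding
# window. The similarity function (Jaccard) and record representation
# (sets of tokens) are assumptions for the example.
from collections import deque

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

class ContinuousTopKJoin:
    def __init__(self, query_records, k, window_size):
        self.query = query_records
        self.k = k
        self.window = deque(maxlen=window_size)  # oldest records expire automatically

    def insert(self, record):
        """Add a newly arrived stream record and return the current top-k."""
        self.window.append(record)
        return self.topk()

    def topk(self):
        # Brute force over all (query record, windowed record) pairs;
        # the paper's indexed structure avoids exactly this recomputation.
        pairs = [(jaccard(q, r), qi, ri)
                 for qi, q in enumerate(self.query)
                 for ri, r in enumerate(self.window)]
        return sorted(pairs, reverse=True)[:self.k]
```

Each arrival costs O(|query| x |window|) similarity evaluations here, which is the cost the proposed index is designed to cut down.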

Paper Nr: 35
Title:

Data Quality Sensitivity Analysis on Aggregate Indicators

Authors:

Mario Mezzanzanica, Roberto Boselli, Mirko Cesarini and Fabio Mercorio

Abstract: Decision making activities stress data and information quality requirements. The quality of data sources is frequently very poor, so a cleansing process is required before using such data in decision making processes. When alternative (and more trusted) data sources are not available, data can be cleansed only using business rules derived from domain knowledge. Business rules focus on fixing inconsistencies, but an inconsistency can be cleansed in different ways (i.e. the correction may not be deterministic), so the choice of how to cleanse data can (even strongly) affect the aggregate values computed for decision making purposes. The paper proposes a methodology exploiting Finite State Systems to quantitatively estimate how computed variables and indicators might be affected by the uncertainty related to low data quality, independently of the data cleansing methodology used. The methodology has been implemented and tested on a real case scenario, providing effective results.

Short Papers
Paper Nr: 5
Title:

NRank: A Unified Platform Independent Approach for Top-K Algorithms

Authors:

Martin Čech and Jaroslav Pokorný

Abstract: Due to the increasing capacity of storage devices and the speed of computer networks in recent years, effective sorting and searching of data is more important than ever. A query result containing thousands of rows from a relational database is usually useless and unreadable. In that situation, users may prefer to define constraints and sorting priorities in the query, and see only the few top rows of the result. This paper deals with top-k query problems, the extension of relational algebra with new operators, and their implementation in a database system. It focuses on optimization of the join and sort operations. The work also includes the implementation and comparison of several algorithms in the standalone .NET library NRank.

Paper Nr: 15
Title:

FIND - A Data Cloud Platform for Financial Data Services

Authors:

Zhicheng Liao, Yun Xiong and Yangyong Zhu

Abstract: In recent years, researchers have paid increasing attention to dealing with large-scale data. However, it is difficult to discover patterns from various sources of big data flexibly and efficiently. In this paper, we design a data cloud platform for financial data services (FIND) and implement a prototype system to evaluate the performance and usability of the data cloud. FIND consists of a cloud infrastructure, a data resource center and a data service portal. It provides high-performance computation, high-quality integrated financial data, sophisticated data mining algorithms, and powerful data services.

Paper Nr: 18
Title:

How Do I Manage My Personal Data? – A Telco Perspective

Authors:

Corrado Moiso, Fabrizio Antonelli and Michele Vescovi

Abstract: Personal data are considered the core of digital services. Data Privacy is the main concern in the currently adopted “organization-centric” approaches for personal data management: this affects the potential benefits arising from a smarter and more valuable use of personal data. We introduce a “user-centric” model for personal data management, where individuals have control over the entire personal data lifecycle from acquisition to storage, from processing to sharing. In particular, the paper analyses the features of a personal data store and discusses how its adoption enables new application scenarios.

Paper Nr: 20
Title:

KIDS - A Model for Developing Evolutionary Database Applications

Authors:

Zhen Hua Liu, Andreas Behrend, Eric Chan, Dieter Gawlick and Adel Ghoneimy

Abstract: Database applications enable users to handle the ever-increasing amount and complexity of data and knowledge, as well as the dissemination of information to ensure timely response to critical events. However, the very process of human problem solving, which requires understanding and tracking the evolution of data, knowledge, and events, is still handled mostly by humans and not by databases and their applications. In this position paper, we propose KIDS as a model that reflects the way humans solve problems, and as a blueprint for extending database technologies to manage data, knowledge, directives and events in a coherent, self-evolving way. Our proposal is based on our experience building database-centric applications that require comprehensive interactions among facts, information, events, and knowledge.

Paper Nr: 37
Title:

Combining Local and Related Context for Word Sense Disambiguation on Specific Domains

Authors:

Franco Rojas-Lopez, Ivan Lopez-Arevalo and Victor Sosa-Sosa

Abstract: In this paper an approach to word sense disambiguation (WSD) in documents is presented. The local and related context of an ambiguous word is extracted, and this context is used to retrieve second-order vectors from WordNet. Two graphs are thus built in parallel and evaluated individually; finally, both results are combined to automatically assign the correct sense to the ambiguous word. The proposed approach was tested on task #17 of the SemEval 2010 international competition, producing promising results compared to other approaches.

Paper Nr: 50
Title:

Towards a Meaningful Analysis of Big Data - Enhancing Data Mining Techniques through a Collaborative Decision Making Environment

Authors:

Nikos Karacapilidis, Stefan Rueping, Georgia Tsiliki and Manolis Tzagarakis

Abstract: Arguing that dealing with data-intensive settings is not a technical problem alone, we propose a hybrid approach that builds on the synergy between machine and human intelligence to facilitate the underlying sense-making and decision making processes. The proposed approach, which can be viewed as an innovative workbench incorporating and orchestrating a set of interoperable services, is illustrated through a real case concerning collaborative subgroup discovery in microarray data. Evaluation results, validating the potential of our approach, are also included.

Paper Nr: 65
Title:

A Data-Centric Approach for Networking Applications

Authors:

Ahmad Ahmad-Kassem, Christophe Bobineau, Christine Collet, Etienne Dublé, Stéphane Grumbach, Fuda Ma, Lourdes Martinez and Stéphane Ubéda

Abstract: The paper introduces our vision for rapid prototyping of heterogeneous and distributed applications. It abstracts a network as a large distributed database providing a unified view of "objects" handled in networks and applications. The applications interact through declarative queries including declarative networking programs (e.g. routing) and/or specific data-oriented distributed algorithms (e.g. distributed join). Case-Based Reasoning is used for optimization of distributed queries by learning when there is no prior knowledge on queried data sources and no related metadata such as data statistics.

Paper Nr: 67
Title:

Skyline Query Processing on Heterogeneous Data - A Conceptual Model

Authors:

Nurul Husna Mohd Saad, Hamidah Ibrahim, Fatimah Sidi and Razali Yaakob

Abstract: Skyline queries have been intensively researched recently due to their importance in realizing useful and non-trivial applications in decision-making environments. However, existing research lacks methods to compute skyline queries over heterogeneous data, where each data item can be represented either as a single certain point or as a continuous range. In this paper, we tackle the problem of skyline analysis on heterogeneous data and propose a method that reduces the number of comparisons between objects while gradually computing the probability of each object being a skyline object. Our model employs two techniques (local pruning and global pruning) for probabilistic skyline queries.
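
For readers unfamiliar with skyline semantics, the certain-point base case can be sketched briefly. This sketch covers only exact points with a simple pruning loop; the paper's contribution, handling ranges probabilistically with local and global pruning, is not reproduced here.

```python
# Illustrative sketch of skyline computation over certain points.
# Convention assumed here: lower values are better in every dimension.

def dominates(p, q):
    """p dominates q if p is <= q in every dimension and < in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    result = []
    for p in points:
        if any(dominates(s, p) for s in result):
            continue                                         # p is dominated: prune it
        result = [s for s in result if not dominates(p, s)]  # p prunes weaker candidates
        result.append(p)
    return result
```

Every dominance check here is one of the pairwise comparisons the abstract's pruning techniques aim to avoid when points become uncertain ranges.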

Paper Nr: 70
Title:

DBMS meets DSMS - Towards a Federated Solution

Authors:

Andreas Behrend, Dieter Gawlick and Daniela Nicklas

Abstract: In this paper, we describe the requirements and benefits of integrating data stream processing with database management systems. Currently, these technologies focus on very different tasks: stream systems extract instances of patterns from streams of transient data, while database systems store, manage, provide access to, and analyze persistent data. Many applications, e.g., patient care, program trading, or flight supervision, however, depend on the functionality and operational characteristics of both types of systems. We discuss how to design a federated system which provides the benefits of both approaches.

Posters
Paper Nr: 71
Title:

Understanding Worldwide Human Information Needs - Revealing Cultural Influences in HCI by Analyzing Big Data in Interactions

Authors:

Rüdiger Heimgärtner

Abstract: Understanding human information needs worldwide requires the analysis of large amounts of data and adequate statistical analysis methods. Factor Analysis and Structural Equation Models (SEM) are a means to reveal structures in data. Data from empirical studies of cultural human-computer interaction (HCI) found in the literature were analyzed using these methods to develop a model of culturally influenced HCI. There are significant differences in HCI style depending on the cultural imprint of the user. With knowledge about the relationship between culture and HCI from this model, local human information needs can be predicted on a worldwide scope.

Area 3 - Trust, Privacy and Security

Full Papers
Paper Nr: 52
Title:

NexusDSS: A System for Security Compliant Processing of Data Streams

Authors:

Nazario Cipriani, Christoph Stach, Oliver Dörler and Bernhard Mitschang

Abstract: Technological advances in microelectronics and communication technology are increasingly leading to a highly connected environment equipped with sensors producing a continuous flow of context data. The steadily growing amount of sensory context data available enables new application scenarios and drives new processing techniques. The growing pervasion of everyday life by social media, and the possibility of interconnecting it with the traces of moving objects, makes access control for this kind of data increasingly important, since it raises privacy issues. The challenge is twofold: first, mechanisms to control data access and data usage must be established; second, efficient and flexible processing of sensitive data must be supported. In this paper we present a flexible and extensible security framework which provides mechanisms to enforce requirements for context data access and, beyond that, supports safe processing of sensitive context data according to predefined processing rules. In addition, and in contrast to previous concepts, our security framework supports especially fine-grained control over contextual data.

Short Papers
Paper Nr: 42
Title:

Towards Process Centered Information Security Management - A Common View for Federated Business Processes and Personal Data Usage Processes

Authors:

Erik Neitzel and Andreas Witt

Abstract: While comparing the progress of our two research projects, the development of an information security management system (ISMS) for federated business process landscapes and the enhancement of the security of social networks, we discovered a fundamental congruency of views concerning the way information security can be handled. This paper presents a conceptual framework which uses ISO 27001 and the German BSI IT-Grundschutz framework as a base for deriving a process-based methodology for information security management, covering both federated business processes within business applications and personal data usage processes within social networks. The proposed layers are (1) process layer, (2) application layer, (3) network layer, (4) IT systems layer and (5) infrastructure layer.

Posters
Paper Nr: 47
Title:

Graph-based Campaign Amplification in Telecom Cloud

Authors:

M. Saravanan, Sandeep Akhouri and Loganath Thamizharasu

Abstract: The majority of telecom operators are making a transition from a monolithic, stove-pipe approach to creating services to a more flexible architecture that gives them the agility to rapidly develop and deploy services. New revenue streams require an ability to rapidly identify and target dynamic shifts in traffic patterns and subscriber behaviour. As subscriber behaviour morphs with plans, promotions, devices, location and time, this presents challenges and opportunities for an operator to create and launch targeted campaigns. The enormous volume of data being generated requires a scalable platform for processing massive xDR volumes (e.g. Call Detail Records). This paper proposes graph databases in a telecom cloud environment for quickly identifying trends, isolating a targeted subscriber base and rapidly launching campaigns. We also highlight the limitations of a conventional relational database in capturing complex relationships, as compared to a NoSQL graph database, and the benefits of automatic provisioning and deployment in the cloud environment.

Area 4 - Ontologies and the Semantic Web

Short Papers
Paper Nr: 19
Title:

An Architecture based on Ontologies, Agents and Metaheuristics Applied to the Multimedia Service of the Brazilian Digital Television System

Authors:

Toni Ismael Wickert and Arthur Tórgo Gómez

Abstract: With the advent of the Brazilian Digital Television System, which reaches approximately 95% of Brazilian homes, users will be able to have an interactive channel through the digital television. Thus, it will be possible to access the multimedia application server, i.e., to send or receive emails, access interactive applications, and watch movies or specific news. This paper proposes the development and implementation of an architecture that includes one module suggesting content to the user according to his profile and another module optimizing the content to be transmitted. The implementation was developed using ontologies, software agents, Tabu Search and a Genetic Algorithm. The results are validated using a metric.

Paper Nr: 22
Title:

Effective and Efficient Online Communication - The Channel Model

Authors:

Anna Fensel, Dieter Fensel, Birgit Leiter and Andreas Thalhammer

Abstract: We discuss the challenge of scalable dissemination in a world where the number of communication channels and interaction possibilities is growing exponentially, particularly on the Web, Web 2.0, and semantic channels. Our goal is to enable smaller organizations to fully exploit this potential. We have developed a new methodology based on distinguishing and explicitly interweaving content and communication as a central means of achieving content reusability, and thereby scalability, over various heterogeneous channels. Here, we present in detail the communication channel model of our approach.

Paper Nr: 31
Title:

Effective Keyword Search via RDF Annotations

Authors:

Roberto De Virgilio and Lorenzo Dolfi

Abstract: Searching for relevant information on the Web can be a very tedious task, and if people cannot navigate through a Web site, they will quickly leave. Designing effective navigation strategies for Web sites is therefore crucial. In this paper we provide and implement centrality indices to guide the user toward an effective navigation of Web pages. We take inspiration from well-known facility location problems to compute the center of a graph: the joint use of such indices guarantees the automatic selection of the best starting point. To validate our approach, we have developed a system that implements the described techniques on top of an engine for keyword-based search over RDF data. The system exploits an interactive front-end to support the user in visualizing both the annotations and the corresponding Web pages. Experiments over widely used benchmarks have shown very good results in terms of both effectiveness and efficiency.
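
One of several possible centrality indices for picking such a starting point can be sketched as follows. This is an illustrative sketch only (closeness-style centrality via BFS on an assumed unweighted link graph), not the specific indices the paper combines.

```python
# Illustrative sketch: choose the "center" of a site's link graph as the
# node minimizing the sum of shortest-path distances to all other nodes.
# The graph representation (adjacency dict, undirected) is an assumption.
from collections import deque

def bfs_distances(graph, source):
    """Unweighted shortest-path distances from source via breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def center(graph):
    """Node with minimal total distance to all reachable nodes."""
    return min(graph, key=lambda u: sum(bfs_distances(graph, u).values()))
```

On a hub-and-spoke site graph this picks the hub, which matches the intuition of a "best starting point" for navigation.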

Paper Nr: 54
Title:

Toward a Product Search Engine based on User Reviews

Authors:

Paolo Fosci and Giuseppe Psaila

Abstract: We address the problem of developing a method for retrieving products by exploiting product user reviews found on the Internet. For this purpose, we introduce a ranking model based on itemset mining of frequent terms. The prototype search engine that implements the proposed retrieval model is illustrated, and a preliminary evaluation on a real data set is discussed.

Paper Nr: 58
Title:

Smart Learning Management System Framework

Authors:

Yeong-Tae Song, Yuanqiong Wang, Sungchul Hong and Yong-Ik Yoon

Abstract: Thanks to modern networking technologies and the advancement of social networks, people in modern society need more and more information just to keep up. In such an environment, the importance of learning and information sharing cannot be overemphasized. Even though a plethora of information is available from various sources such as the Web, libraries, and learning material repositories, it may go unused if it is not readily available or does not meet the needs of the user. We therefore need a system that can provide customized information, matched to the user's level and interests. Such a system should understand what the user's interests are, what level the user has reached in a topic, and so on. In this paper, we propose a framework for a smart learning management system (SLMS) that utilizes user profiles and semantically organized learning objects so that only relevant information is delivered to the user. The SLMS maintains user profiles, continuously updated whenever there is a change, and learning objects organized by means of an ontology. Upon a user's request, the system fetches learning materials that match the user's profile and are suitable for the requested topic and level, sorted by relevancy ranking.

Posters
Paper Nr: 30
Title:

Ontology Similarity Measurement Method in Rapid Data Integration

Authors:

Juebo Wu, Chen-Chieh Feng and Chih-Yuan Chen

Abstract: Rapid data integration has been a challenging topic in computer science and related subjects, widely used in data warehousing, artificial intelligence, biomedicine, and geographical information systems. In this paper, we present a method of ontology similarity measurement for rapid data integration, by means of semantic ontology from a high-level perspective. The edit distance algorithm is introduced as the basic principle for ontology similarity calculation. A case study is carried out, and the results show that the presented method is feasible and effective.
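
The edit-distance building block mentioned above can be sketched directly. This is a minimal illustration, assuming similarity is computed between concept labels and normalized to [0, 1]; the paper's full matching procedure is not reproduced.

```python
# Illustrative sketch: normalized edit-distance similarity between ontology
# concept labels, a common building block for ontology matching.

def edit_distance(s, t):
    """Classic Levenshtein distance via dynamic programming (two rows)."""
    m, n = len(s), len(t)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution / match
        prev = cur
    return prev[n]

def label_similarity(a, b):
    """Map edit distance into [0, 1]: 1.0 means identical labels."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a.lower(), b.lower()) / max(len(a), len(b))
```

Case folding before the distance computation is itself an assumption; real ontology matchers typically add tokenization and synonym handling on top of this base measure.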

Paper Nr: 43
Title:

Ontology-engineered MSME Framework

Authors:

M. Saravanan, M. Amirtha Varshini and Brindha S.

Abstract: Micro, Small and Medium scale Enterprises (MSMEs) hold an unfailing distinction of being pillars of equitable economic growth. Lack of proper business platforms and knowledge of marketing strategies renders MSMEs vulnerable to exploitation by middlemen. The Web has extended e-business platforms, e-commerce and micro-financing solutions to assist MSMEs, but these web-based solutions fail to obliterate intermediation. In view of the advancements and customer growth in the telecommunications field, we utilize the mobile platform to offer trading solutions to MSMEs. In this paper, we propose a mobile phone-based, ontology-engineered framework for MSMEs that can achieve disintermediation. The framework has been tested on a mobile cloud akin to the EC2 cloud environment and integrated with an Android application that provides easy access, anytime and anywhere. The envisioned framework will boost MSME margins, build healthy business ties and transform MSMEs into self-sufficient establishments equipped with full-fledged trading systems operating in a mobile phone-enabled environment.

Paper Nr: 51
Title:

Create a Specialized Search Engine - The Case of an RSS Search Engine

Authors:

Robert Viseur

Abstract: Several approaches are possible for creating specialized search engines. For example, you can use the API of existing commercial search engines or create engine from scratch with reusable components such as open source indexer. RSS format is used for spreading information from websites, creating new applications (mashups), or collecting information for competitive or technical watch. In this paper, we focus on the study case of an RSS search engine development. We identify issues and propose ways to address them.