Student Research Forum 2014
March 31 and April 1, 2014
COMPUTER SCIENCE AND COMPUTATIONAL SCIENCE
MONDAY, MARCH 31, 2014, EN-2022
Session Chair: Dr. Minglun Gong
Head, Computer Science: Dr. Wolfgang Banzhaf
Director, Computational Science: Dr. Martin Plumer
0900 Opening Remarks
BREAK (20 minutes)
TUESDAY, APRIL 1, 2014, EN-2022
Session Chair: Dr. Minglun Gong
Head, Computer Science: Dr. Wolfgang Banzhaf
Director, Computational Science: Dr. Martin Plumer
0900 Opening Remarks
BREAK (20 minutes)
LUNCH (1 hour)
1550 Closing Remarks
MONDAY, MARCH 31, 2014
0910 Hemanth Billipati, M.Sc. Computer Science
Title: Durability in Main Memory Database
Abstract: A main memory database (MMDB) is a database management system that stores data entirely in main memory. Currently, most industries are moving towards MMDBs to reduce query processing time and minimize disk I/O operations. In a disk-based database the stable disk copy ensures durability, but main memory is volatile, so data is lost if the system crashes, and a crash may leave the system in a complicated or unrecoverable state. In this presentation, I will discuss these recovery problems and the algorithmic approaches to backup and recovery of the database.
* * *
0930 Wenbin Zhang, M.Sc. Computer Science
Title: Feature Selection in Mixture Cure Model for Prognosis in Colon Cancer
Abstract: In recent years, breakthroughs in cancer research have made available a wealth of data in which the number of features exceeds the number of observations. In this case, one might be interested in (a) identifying features that predict the cure status and risk of recurrence, and (b) developing a multivariate model that can be used to predict survival in a new observation. The mixture cure model has recently been used to address the latter problem. However, no specialized feature selection method for the cure model is available yet. This talk presents a correlation-based feature selection approach for the cure model, with the expectation-maximization algorithm used for estimation. We apply the proposed method to the analysis of molecular genetic prognostic factors for disease-free survival and time to disease recurrence in a cohort of patients with colon cancer. Simulations will be conducted to evaluate the performance of the proposed approach.
* * *
0950 Tamkin Avi, M.Sc. Computer Science
Title: Glider Mission Planning Using Generic Solvers
Abstract: Autonomous underwater vehicles (AUVs), in particular gliders, have been used for many tasks such as underwater surveys. During a mission, several factors affect their performance: ocean currents, lack of communication and GPS while under water, and weather. At present, engineers do not seem to do much mission planning: they just give a fixed list of waypoints to an AUV before sending it on a mission. Developing techniques for planning such missions for possibly heterogeneous groups of gliders is the goal of this project. In this talk, I will present a software application for the glider mission planning problem. At the heart of such mission planning are variants of the Asymmetric Travelling Salesman Problem (ATSP): as it is NP-hard, there is no known algorithm to solve it exactly in polynomial time. Instead, we experimented with a number of heuristic solvers, Integer Linear Programming (CPLEX) or Satisfiability-based Pseudo-Boolean (Sat4j, Clasp, SCIP), as the core engine of our planner. Specifically, our software generates encodings of ATSP from the user-provided ocean data graph and goal points, then invokes different solvers to compute an optimal order in which to visit the goal points. We then analyze the performance of a variety of state-of-the-art solvers on different encodings of the mission planning problem.
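To make the underlying problem concrete, the following is a minimal sketch of the ATSP on a tiny instance, solved by exhaustive search. It is not the encoding-and-solver pipeline described in the abstract: the function name `best_visit_order`, the nested-dictionary cost table, and the assumption that the glider returns to its start point are all illustrative choices. Exhaustive search is only feasible for a handful of goal points, which is why dedicated solvers are needed in practice.

```python
from itertools import permutations


def best_visit_order(cost, start, goals):
    """Try every ordering of `goals` and return the cheapest round trip.

    cost[a][b] is the travel cost from a to b; it may differ from cost[b][a],
    for example because of ocean currents (hypothetical data layout).
    """
    best_order, best_cost = None, float("inf")
    for order in permutations(goals):
        total, here = 0.0, start
        for nxt in order:
            total += cost[here][nxt]
            here = nxt
        total += cost[here][start]  # assumed return leg to the start point
        if total < best_cost:
            best_order, best_cost = order, total
    return best_order, best_cost


# Made-up asymmetric costs between a start point S and goal points A, B, C.
cost = {
    "S": {"A": 1, "B": 5, "C": 4},
    "A": {"S": 6, "B": 1, "C": 7},
    "B": {"S": 3, "A": 8, "C": 1},
    "C": {"S": 1, "A": 2, "B": 9},
}
order, total = best_visit_order(cost, "S", ["A", "B", "C"])
print(order, total)
```

Because the costs are asymmetric, traversing the same tour in the opposite direction generally has a different total cost, which is what distinguishes the ATSP from the symmetric problem.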
* * *
1010 Mengzhou Wu, M.Sc. Computer Science
Title: Qualitative Navigation System for Autonomous Underwater Vehicle
Abstract: The AUV (Autonomous Underwater Vehicle) is a small unmanned vehicle that works underwater. Since it does not require human control after launch, it has become one of the hottest research topics in ocean engineering in recent years. Traditionally, the AUV is navigated by both INS (Inertial Navigation System) and GPS (Global Positioning System). When the AUV is working underwater, where no GPS signal can reach, only the INS provides position data through dead reckoning, which leads to accumulating error. In order to correct the drift caused by the INS, the AUV surfaces regularly to receive GPS data. This requirement restricts the AUV's longest continuous working time.
The QNS system developed by the MERLIN (Marine Environmental Research Laboratory for Intelligent Vehicles) lab of Memorial University of Newfoundland is an innovative method of navigating the AUV that sidesteps the GPS restriction. The QNS system uses pre-collected seabed sonar images as a map of the underwater environment. While the AUV is working, it scans the seabed and compares the resulting sonar images with the pre-collected ones. By measuring the similarity between the pre-collected images and those collected during the mission, the QNS estimates the AUV's position.
* * *
1030 Yiming Qian, M.Sc. Computer Science
Title: Data Classification with One-class Support Vector Machines
Abstract: The Support Vector Machine (SVM) is one of the most popular batch learning methods for classification due to its high accuracy. However, for binary classification, the binary SVM algorithm trains only one SVM model describing a single separating hyperplane, which may not provide sufficient confidence when labeling ambiguous data. This talk presents an SVM classifier that addresses this drawback. The key idea is to solve both binary and multi-class classification problems using multiple competing one-class support vector machines (C-1SVMs). The approach exploits the advantage of online learning, producing a partially trained model immediately, which is then gradually refined toward the final solution. It can also achieve faster convergence. Real-world problems of foreground segmentation and boundary matting for live videos in computer vision have been solved to demonstrate the effectiveness of C-1SVMs. The results show the method to be particularly competent at processing a wide range of videos with complex backgrounds from freely moving cameras.
* * *
1050 Feng Wu, M.Sc. Computer Science
Title: Emergent Ontology in Tagging System
Abstract: With the advent of the Big Data era, data tends to be more unstructured and heterogeneous. Metadata, especially ontologies, are essential for understanding, filtering, and integrating data in this scenario. While expert-built ontologies are infeasible given the sheer amount of available data, a bottom-up approach may provide a unique solution for making sense of the data, since the tagging process mimics the human cognition process. However, folksonomies, or user-generated tags, are often criticized for lacking semantics, as the absence of semantic connections between tags renders them less useful as metadata.
In this presentation I will summarize and compare existing approaches for enriching tagging semantics. Then the concept of property precedence will be introduced to interpret the psychological implications of actors in the tagging process. Based on property precedence we may build semantic relationships between tags; this further research direction will be briefly discussed.
* * *
1130 Abdullah-Al-Mamun, M.Sc. Computer Science
Title: Change Detection in Environmental Monitoring Data Streams
Abstract: There is so much data being collected now, and new data arrives so fast, that it is impossible to store all of it. Moreover, even if we could store it all, we may not have the time to scan it before making judgements. This is a new kind of setting in computing: processing a stream of data as opposed to static, multiple-access data. Data streams are temporally ordered, fast changing, massive, and potentially infinite. The data flows in and out of an observation platform continuously and with varying update rates. Typical examples of data streams include wireless sensor network traffic, telecommunications, on-line transactions in the financial market or retail industry, web click streams, video surveillance, and weather or environment monitoring. As these kinds of data are not normally stored in any kind of data repository, effective and efficient management and online analysis of data streams bring new challenges.
Many data streams evolve over time, presenting a fundamental issue in data stream mining. In a data stream, the distribution that generates the examples changes over time, whereas most machine learning algorithms assume that examples are generated at random according to some stationary probability distribution. Thus detecting change and adapting to it in the learning process are among the most challenging problems when learning from an evolving data stream. Moreover, change detection is not an easy task, since a fundamental limitation exists: the design of a change detector is a compromise between detecting true changes and avoiding false alarms. In many cases, the changes a data stream shows over time can be used for understanding the nature of several applications, such as detecting online changes from real-time surveillance data.
Here, we have considered several existing methods that were proposed to detect both concept drift (gradual change) and concept shift (sudden change) in data streams. In particular, in collaboration with a local company, we are looking at the data streams generated by a real-time environment monitoring system. Within this collaboration, we are planning to adapt existing methods and develop novel change detection algorithms in the context of environmental sensor data monitoring and processing.
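As a rough illustration of what a change detector on a stream can look like, here is a small sketch of the Page-Hinkley test, a classic sequential detector for an upward shift in the mean. The abstract does not say which methods were considered, so this is a stand-in rather than the authors' approach; the class name, the parameter values, and the synthetic stream are assumptions. The `threshold` parameter expresses the compromise mentioned above: a lower value detects changes sooner but raises more false alarms.

```python
class PageHinkley:
    """Page-Hinkley test for detecting an increase in the mean of a stream."""

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated magnitude of fluctuation
        self.threshold = threshold  # alarm level (assumed value)
        self.count = 0
        self.mean = 0.0             # running mean of the values seen so far
        self.cumulative = 0.0       # cumulative deviation from the running mean
        self.minimum = 0.0          # smallest cumulative deviation seen so far

    def update(self, value):
        """Consume one value; return True if a change is signalled."""
        self.count += 1
        self.mean += (value - self.mean) / self.count
        self.cumulative += value - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold


# Synthetic stream with a sudden change (concept shift) at position 50.
stream = [0.0] * 50 + [5.0] * 50
detector = PageHinkley()
first_alarm = next((i for i, x in enumerate(stream) if detector.update(x)), None)
print("first alarm at position", first_alarm)
```

The detector keeps only four numbers of state, so it processes each value once and never needs to store the stream, which matches the one-pass setting described in the abstract.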
* * *
1150 Sri Raghava Sudana, M.Sc. Computer Science
Title: Infrastructure Automation Tools
Abstract: Infrastructure in computing refers to the creation and organization of servers in a physical data center or in a cloud computing environment. Previously, server creation and software installation had to be done manually. You had to be on the server (either through remote desktop or an SSH connection) to install any software. Changes on the server were made manually too. When the number of servers grows large, deploying and changing servers manually is time consuming and error prone. Infrastructure automation tools help you deploy and configure servers by turning “infrastructure into code”. In this presentation, I will discuss Chef, one of the tools built for infrastructure automation. I will briefly talk about the necessity of infrastructure automation, how Chef addresses the problems associated with it, and the work I have done with Chef.
* * *
1210 Zequan Feng, M.Sc. Computer Science
Title: An Interactive Interface for Extracting Foreground Objects from Videos
Abstract: How to extract foreground objects from videos has been actively studied in both the computer graphics and vision fields. Recently, an effective foreground segmentation algorithm has been proposed, which can segment objects with fuzzy boundaries from videos captured by a moving camera in real time. The core idea of this algorithm is to train both foreground and background classifiers based on user-provided examples and use the classifiers to label the remaining pixels. The objective of this project is to implement a friendly interface that allows users to select foreground and background examples interactively using scribbles and provides instant feedback on the segmentation result under the current example selection. This allows users to quickly provide additional examples if necessary for generating satisfactory results. The interface is designed with touch screens in mind to allow users to draw scribbles over the images in an intuitive manner.
* * *
1230 Safwan Mohammad Mustafa, M.Sc. Computer Science
Title: Integrating Structured Data Using Property Precedence
Abstract: Data integration systems offer uniform access to a set of autonomous and heterogeneous data sources. One of the main challenges in data integration is reconciling semantic differences among data sources. Approaches that have been used to solve this problem can be categorized as schema-based and attribute-based. Schema-based approaches use schema information to identify the semantic similarity in data; furthermore, they focus on reconciling types before reconciling attributes. In contrast, attribute-based approaches use statistical and structural information about attributes to identify the semantic similarity of data in different sources. This research will examine an approach to semantic reconciliation based on integrating properties expressed at different levels of abstraction or granularity using the concept of property precedence. Property precedence reconciles the meaning of attributes by identifying similarities between attributes based on what these attributes represent in the real world. In order to use property precedence for semantic integration, we need to identify the precedence of attributes within and across data sources. The goal of this research is to develop and evaluate a method and algorithms that will identify precedence relations among attributes and build a property precedence graph (PPG) that can be used to support integration.
* * *
1250 Sheng Chieh Lin & Chengling Huang, M.Sc. Computer Science
Title: An iOS/Android Mobile Application for Personal Health Tracking
Abstract: Health tracking is considered to be one of the most important activities in our daily life. In this project, we mainly focus on developing a personalized mobile application for people to record their daily food intake, sleep quality, and (psychological) pressure rating data. By analyzing these daily data, we can assess how food impacts our health. There are many applications on the market that serve the same common functions of health tracking (nutrition facts, weight, etc.). However, our application has two interesting new features, namely sleep quality and pressure ratings. By incorporating these two features, we found that daily calorie intake increased by a certain amount along with changes in daily pressure and sleep quality. All of these data will be analyzed further using machine learning techniques in order to find their relationships for weight prediction.
* * *
TUESDAY, APRIL 1, 2014
0910 Xin Lin, M.Sc. Computer Science
Title: Checking Recursive-Descent Conditions in Extended Context-Free Grammars
Abstract: A context-free grammar is a Chomsky grammar in which every production has a single non-terminal symbol on the left-hand side and an arbitrary string of terminal and non-terminal symbols on the right-hand side. An extended context-free grammar allows regular expressions on the right-hand sides of productions. Recursive-descent parsing uses a set of recursive procedures which represent the productions of an extended context-free grammar. However, the recursive-descent approach can be used only if the underlying grammar is not left recursive and is conflict-free. The objective of this project is to write a program that reads an extended context-free grammar in a version of the BNF notation and checks whether the grammar satisfies the conditions of recursive-descent parsing.
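One of the conditions named above, the absence of left recursion, can be sketched as follows for a plain (non-extended) context-free grammar. This is only an illustration under stated assumptions: the dictionary-based grammar representation and the function names are hypothetical, not the project's input format, and the sketch handles neither regular-expression right-hand sides nor the conflict-freeness check.

```python
def nullable_nonterminals(grammar):
    """Return the set of nonterminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, productions in grammar.items():
            if head in nullable:
                continue
            # A head is nullable if some production has only nullable symbols
            # (an empty production qualifies trivially).
            if any(all(s in nullable for s in prod) for prod in productions):
                nullable.add(head)
                changed = True
    return nullable


def left_recursive_nonterminals(grammar):
    """Return every nonterminal A that can derive a string starting with A."""
    nullable = nullable_nonterminals(grammar)
    # left_corner[A]: nonterminals that can appear leftmost after one step from A.
    left_corner = {head: set() for head in grammar}
    for head, productions in grammar.items():
        for prod in productions:
            for sym in prod:
                if sym in grammar:
                    left_corner[head].add(sym)
                if sym not in nullable:
                    break  # symbols after a non-nullable one cannot be leftmost
    result = set()
    for start in grammar:
        # Depth-first search for a path leading from `start` back to itself.
        stack, seen = list(left_corner[start]), set()
        while stack:
            node = stack.pop()
            if node == start:
                result.add(start)
                break
            if node not in seen:
                seen.add(node)
                stack.extend(left_corner[node])
    return result


# Keys are nonterminals; each production is a list of symbols ([] is empty).
direct = {"E": [["E", "+", "T"], ["T"]], "T": [["id"]]}
fixed = {"E": [["T", "R"]], "R": [["+", "T", "R"], []], "T": [["id"]]}
print(left_recursive_nonterminals(direct))
print(left_recursive_nonterminals(fixed))
```

Tracking nullable nonterminals matters because a production such as `S -> A S` is left recursive whenever `A` can derive the empty string, even though `S` is not its first symbol.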
* * *
0930 Khurram Shahzad, M.Sc. Computer Science
Title: Hadoop Framework for Mobile Systems
Abstract: Hadoop is used for processing applications on large clusters of machines, which are called nodes. It is an open source framework for distributed computing. It divides an application into chunks of tasks and executes each task in parallel on different machines. Hadoop also provides a distributed file system, HDFS (Hadoop Distributed File System), which stores data on different nodes and provides high aggregate bandwidth across them. The motivation of this research is to explore the Hadoop framework for a mobile environment, in which multiple tasks are split and processed on different mobile devices to harness their computational power. A complete implementation of Hadoop in a mobile environment is out of scope; however, the research aims to explore the implementation as far as possible.
* * *
0950 Ben Fowler, Ph.D. Computer Science
Title: Modelling Evolvability for Faster Learning in Genetic Programming
Abstract: Genetic programming is a machine learning method inspired by biological evolution; a population of individual programs compete with one another to solve a problem, and those that are better at solving the problem are judged as more fit. Individuals mutate and recombine with other programs. The more fit individuals are more likely to be selected for the next generation of programs. In this manner, the population of programs gradually becomes more fit, thus eventually evolving a solution to the problem. Evolvability in genetic programming refers to the ability of an individual or population of programs to produce higher fitness individuals. It is not typically directly included in fitness evaluation. If measured and used to influence selection, evolvability generally produces more fit individuals in fewer generations than otherwise. However, properly measuring evolvability requires extensive computation, and the computational loss often exceeds the computational gains. I propose to model evolvability using statistics acquired through the genetic process. These statistics are used to build a model (using other machine learning methods) which can predict the evolvability of a program, which can then be used to enhance evolution, without the computational loss of measuring evolvability.
* * *
1010 Abdullah Ali Faruq, M.Sc. Computer Science
Title: Planning an Interesting Path
Abstract: For years, autonomous underwater vehicles (AUVs) such as Slocum gliders have been used for ocean survey, iceberg mapping and many other types of missions. As these types of AUVs are not able to obtain their location using GPS while under water, and strong currents can take an AUV far away from its planned course, researchers have been exploring other methods for allowing an AUV to determine its location.
In particular, Ralf Bachmayer and Brian Claus have been investigating the use of sonar to track the distance to the bottom of the ocean, comparing the resulting profile to a stored map to determine the glider's location. This approach works well if the ocean bottom is uneven and rich in unique features; however, it does not work as well when flying over a vast uniform plane. This raises the question of planning a path in such a way that it is "interesting", or "feature-rich", enough that the glider does not get lost, yet also takes into account ocean currents and other constraints so that the glider arrives at its destination within a predefined time bound.
In this talk, we will define the resulting multivariate optimization problem and discuss possible approaches to designing algorithms for path planning in such a setting.
* * *
1030 Scott Watson, Ph.D. Computer Science
Title: Automated Design for Playability in Computer Game Agents
Abstract: This presentation explores whether a novel approach to the creation of agent controllers has potential to overcome some of the drawbacks that have prevented novel controller architectures from being widely implemented. This is done by using an evolutionary algorithm to generate finite state machine controllers for agents in a simple role playing game. The concept of minimally playable games is introduced to serve as the basis of a method of evaluating the fitness of a game's agent controllers.
* * *
1050 Shanmei Liu, M.Sc. Computer Science
Title: Modelling Neurons in the Respiratory CPG of a Pond Snail
Abstract: Building a single-neuron model is the foundation of modelling a neural network and analyzing the desired behaviors mathematically. Among numerous existing single-neuron models, the Hodgkin-Huxley (HH) type model gives a detailed quantitative description of the electrical events of action potential generation. This type of model, with respect to the structural and functional properties of ion channels, is closely linked to experimental data. In this presentation I will briefly talk about the HH mechanism of ion gating and about modelling, in the NEURON environment, the respiratory CPG of a pond snail, specifically the pacemaker neuron of the entire network.
* * *
1130 Mitu Debnath, M.Sc. Computer Science
Title: Policy-based Fine-grained Access Control in Online Social Networking (OSN) Platforms
Abstract: Online Social Networking systems such as Facebook, LinkedIn, and Twitter have changed the way users interact on the internet. GeoCities, introduced in 1994, was the first social networking site. Today, Social Networking Services (SNSs) have become an integral part of modern society. Through these platforms it is easy to share data among scientists, researchers, healthcare providers, and the general public. Social networking platforms have become a great tool for collaboration among cross-disciplinary personnel, enabling rapid data sharing and communication. Like all other communication media, social media has its advantages and disadvantages. Current social networks require users to place trust in their service providers, and the inability of service providers to protect users from malicious agents has led to sensitive private information being made public. We propose an architecture for online social networking platforms that protects users’ social information from both the service provider and other malicious users. In this research we will analyse the security and privacy issues in OSNs and will use efficient cryptographic techniques with fine-grained access policies suitable for OSNs to enhance data security and privacy.
* * *
1150 Hao Yuan, M.Sc. Computer Science
Title: Label Noise Handling with One-class SVM
Abstract: The presence of noise in the training set has a strong impact on the performance of supervised learning (classification) techniques. In classification, there are two types of noise: feature noise and label noise. Modeling and dealing with both types of noise is important; however, there is not much literature on label noise. As the one-class SVM algorithm has an advantage over the standard binary SVM in handling ambiguous data, there is a strong indication that a one-class SVM can better handle label noise. In this research, we will analyze label noise, design a one-class SVM based approach to deal with it, evaluate and compare our method with conventional methods, and apply it to social media classification, specifically the Yahoo! large-scale Flickr-tag image classification grand challenge.
Keywords: Label noise, one-class SVM, social media, big data
* * *
1210 Mohammad Hizbul Bahar Arif, M.Sc. Computational Science
Title: Stochastic Climate Representation over North America
Abstract: I will present a computationally efficient stochastic climate model to simulate large-scale climate variations. In analogy with weather generators (a weather generator is a stochastic model that simulates daily weather based on the statistical characteristics of a local weather record), this can be thought of as a climate generator. Inputs for the climate generator include the temperature field from a simplified energy balance climate model, surface elevation, ice mask, and solar insolation. The climate generator then outputs mean monthly surface temperature and precipitation fields over North America using Bayesian artificial neural networks as nonlinear regressors. These outputs will then be used to drive the MUN Glacial Systems Model. I expect that through development of this climate generator and validation against General Circulation Model (FAMOUS model) output, the glacial systems model will be driven with a more realistic climate forcing.
* * *
1230 Amitesh Saha Shuva, M.Sc. Computer Science
Title: Calculating the Relative Degree Sequence (RDS) of Vertex-deleted Subgraph and Consideration of Possible Implementation of RDS
Abstract: The degree sequence of a vertex of a graph G is the sequence obtained by listing the degrees of the neighbors of that vertex in non-decreasing order. The relative degree sequence follows from all the induced subgraphs of G. The relative degree sequence originating from the induced subgraphs in fact determines the graph G uniquely. In my presentation I will explain the process of calculating the RDS of all possible vertex-deleted subgraphs. RDS is especially important for finding out whether two vertices in a graph are connected or not. I will also try to explain other uses of RDS in graph algorithms and its bearing on the reconstruction conjecture.
Keywords: Relative Degree Sequence, Reconstruction Conjecture, vertex-deleted subgraph.
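The two basic ingredients named in the abstract, the degree sequence of a vertex and a vertex-deleted subgraph, can be sketched as below. The adjacency-map representation, the function names, and the example graph are assumptions made for illustration; the relative degree sequence itself is not computed here, since the abstract does not give its full definition.

```python
def vertex_degree_sequence(adjacency, vertex):
    """Degrees of the neighbours of `vertex`, in non-decreasing order."""
    return sorted(len(adjacency[n]) for n in adjacency[vertex])


def delete_vertex(adjacency, vertex):
    """Adjacency map of the vertex-deleted subgraph G - vertex."""
    return {u: nbrs - {vertex} for u, nbrs in adjacency.items() if u != vertex}


# Undirected example graph: a triangle b-c-d with a pendant vertex a attached to b.
graph = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b", "d"},
    "d": {"b", "c"},
}
print(vertex_degree_sequence(graph, "b"))
for v in graph:
    sub = delete_vertex(graph, v)
    print(v, {u: vertex_degree_sequence(sub, u) for u in sub})
```

Each vertex-deleted subgraph is built as a fresh map, so the original graph is left unchanged and can be reused for every deletion.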
* * *
1250 Hadis Kakanejadifard, M.Sc. Computer Science & Somayeh Kafaie, Ph.D. Engineering
Title: Intelligent Educational Environment Using iBeacon
Abstract: In 2013, iBeacon was announced by Apple as a new technology which provides a higher level of location awareness. iBeacon is a built-in, cross-platform technology for Android and iOS devices which can utilize Bluetooth Low Energy (BLE) for indoor positioning. This technology has significant advantages over other forms of indoor positioning technology, such as less expensive hardware, lower energy drain, no need for an internet connection, and the capability to receive notifications in the background. This technology will bring huge change to future location awareness applications. It will change the way that retailers, event organizers, and educational institutions communicate with people indoors. For example, Apple is currently using iBeacons in its retail stores. A great application of iBeacon is in educational environments, where it can be used in orientations to assist visitors and familiarize them with the environment. By applying this technology in mobile applications, users are not only able to detect their current location in the environment, but also receive useful information about that specific location in a user-friendly way. To implement this idea, we have developed a positioning application for an educational environment and conducted some experiments in the Engineering building of Memorial University of Newfoundland with beacons installed in three different locations. The developed application is able to provide multimedia information about these locations while the user is nearby.
Keywords: iBeacon, Location Awareness, Positioning
* * *
1410 Zahra Sajedinia, M.Sc. Computer Science
Title: Computational Complexity Analysis of General Problem Solver
Abstract: In contemporary psychology, the human brain is generally considered a finite system with limitations in space and time, and natural cognitive processes in the human brain are modeled as computational processes. This perspective makes computational complexity theory an appropriate tool for designing and evaluating cognitive models by identifying the aspects that are computationally unrealistic or implausible. The idea of using computational complexity in cognitive psychology for evaluating cognitive models led researchers to develop two theories. The first is the P-cognition thesis, which is based on NP-completeness theory; the second is the FPT-cognition thesis, which uses parameterized computational complexity analysis.
In this research, using classical and parameterized computational complexity analysis, General Problem Solver (GPS), a cognitive model of human problem solving, will be analyzed. The analysis will be inspired by the similar previous work on AI planning models such as STRIPS planning. Related parameters from the psychology literature will be extracted and the sources of intractability in GPS will be determined. Finally, the results will be interpreted with respect to P-cognition and FPT-cognition theses. This study will benefit both computer science and psychology. It will benefit computer science by providing a detailed comparison between the GPS and STRIPS planning models, providing classical and parameterized complexity analyses on GPS and determining parameters which make GPS tractable or intractable. It will benefit psychology by providing a formal computational evaluation on GPS, verifying the parameters that experimentally have been shown to affect human problem solving. Furthermore, the results can be used to evaluate some cognitive architectures such as SOAR, and it will help cognitive psychologists to design more accurate models of human problem solving.
Keywords: Computational Complexity, Problem Solving, GPS (General Problem Solver), Parameterized Complexity, P-cognition Thesis, FPT-cognition Thesis
* * *
1430 Chen Zhang, Ph.D. Computer Science
Title: Network Coding for Opportunistic Multipath Routing in Wireless Network
Abstract: In this paper, we present a new forwarding paradigm that generalizes opportunistic routing in wireless networks. We use multipath routing to find a group of nodes to opportunistically forward coded packets. Each node has a schedule based on the ETX and the multiplicity of coded packets. It is well known that network coding achieves high throughput in the face of lossy wireless links and reduces duplicated packets in networks. Multipath routing guarantees the robustness of the route to the destination, while the network coding mechanism reduces duplicated packets in the network. Compared to previous work, our schedule obtains the opportunistic gains and also increases spatial reuse.
* * *
1450 Sahand Seifimamaghani, M.Sc. Computer Science
Title: Real-Time Registration of Highly Variant Colour+Depth Image Pairs
Abstract: The focus of this research is to develop algorithms that align color+depth image pairs taken of the same scene from different positions in real time. Existing registration methods address this issue for image pairs that share most of the same scene and have small differences. Other algorithms can align image pairs with higher variation, but they do not perform in real time. The registration technique proposed in this work uses a combination of 2D image feature detection algorithms and a false feature pair rejection method. It not only performs in real time, but also supports large transformations with 6 degrees of freedom. Unlike the majority of available methods, the prototype of this technique also performs well when the image pairs overlap only partially (50 percent or more).
* * *
1510 Esteban Ricalde, Ph.D. Computer Science
Title: Evolutionary Algorithm Approaches to the Traffic Signal Control Problem
Abstract: Urban traffic network control is a complex nonlinear problem. Traffic jams affect the daily life of citizens. Furthermore, the rapid increase of the metropolitan population makes the control of traffic signals a more challenging task. Intelligent Transportation Systems recently emerged to meet this challenge.
This presentation gives a general description of the Traffic Signal Control Problem, explores different Intelligent Transportation Systems implemented to deal with it, with emphasis on Evolutionary Algorithm solutions, and presents a new approach based on Decision Trees and Genetic Programming.
* * *
1530 Naji Mahmoud, M.Sc. Computer Science
Title: ToolSecurity: Asset Tracking Solution for Industrial Purposes
Abstract: Asset management has become an important factor for industries in achieving their goals and maintaining a presence in a competitive market. Oil and gas companies, for example, rely heavily on their assets to deliver their services; such companies need to keep track of a huge volume of assets scattered throughout the company. The growing number of assets and the flexibility of their usage would normally contribute to poor tracking and major recovery expenses.
An automated tracking solution is being developed to track assets accurately and thereby reduce recovery costs. The solution can search, identify and manage assets using handheld devices that are synchronized with the company’s asset database.
* * *
1550 Closing Remarks