| 2013-28 | | Other | Yujie He, Wenlin Chen, Yixin Chen | 5/12/2013 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1016/kdd-2013.pdf | This paper introduces a supervised metric learning algorithm, called kernel density metric learning (KDML), which is easy to use and provides nonlinear, probability-based distance measures. KDML constructs a direct nonlinear mapping from the original input space into a feature space based on kernel density estimation. The nonlinear mapping in KDML embodies established distance measures between probability density functions, and leads to correct classification on datasets for which linear metric learning methods would fail. Existing metric learning algorithms, such as large margin nearest neighbors (LMNN), can then be applied to the KDML features to learn a Mahalanobis distance. We also propose an integrated optimization algorithm that learns not only the Mahalanobis matrix but also kernel bandwidths, the only hyper-parameters in the nonlinear mapping. KDML can naturally handle not only numerical features, but also categorical ones, which is rarely found in previous metric learning algorithms. Extensive experimental results on various benchmark datasets show that KDML significantly improves existing metric learning algorithms in terms of kNN classification accuracy. | wenlinchen@wustl.edu | | |
| 2013-27 | | MS Project Report | Andrew Shulman | 5/10/2013 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1015/Shulman_project.pdf | If you compare two identical images on two different monitors, they will likely appear different. Every display device is supposed to adhere to a particular set of standards regulating the color and intensity of the image it outputs. However, in practice, very few do. Color calibration is the practice of modifying the signal path such that the colors produced more closely match reference standards. This is essential for graphics professionals who are mastering original content. They must ensure that the source material appears correct when viewed on a reference monitor. When viewed on a consumer panel, however, some error will appear on an uncalibrated panel. On low-end devices, calibration helps to maximize picture quality, producing an image comparable to a more expensive panel out of the box. On high-end devices, it can reduce color error to levels imperceptible to the human eye.
| | | |
| 2013-26 | | MS Thesis | Bo Li | 4/26/2013 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1014/wcps-thesis-final.pdf | Wireless Structural Control (WSC) systems can play a crucial role in protecting civil infrastructure in the event of earthquakes and other natural disasters. Such systems represent an exemplary class of cyber-physical systems that perform close-loop control using wireless sensor networks. Existing WSC research usually employs wireless sensors installed on small lab structures, which cannot capture realistic delays and data loss in wireless sensor networks deployed on large civil structures. The lack of realistic tools that capture both the cyber (wireless) and physical (structural) aspects of WSC systems has been a hurdle for cyber-physical systems research for civil infrastructure. This advances the state of the art through the following contributions. First, we developed the Wireless Cyber-Physical Simulator (WCPS), an integrated environment that combines realistic simulations of both wireless sensor networks and structures. WCPS integrates Simulink and TOSSIM, a state-of-the-art sensor network simulator featuring a realistic wireless model seeded by signal traces. Second, we performed two realistic case studies each combining a structural model with wireless traces collected from real-world environments. The building study combines a benchmark building model and wireless traces collected from a multi-story building. The bridge study combines the structural model of the Cape Girardeau bridge over the Mississippi River and wireless traces collected from a similar bridge (the Jindo Bridge) in South Korea. These case studies shed light on the challenges of WSC and the limitations of a traditional structural control approach under realistic wireless conditions. Finally, we proposed a cyber-physical co-design approach to WSC that integrates a novel holistic scheduling scheme (for sensing, communication and control) and an Optimal Time Delay Controller (OTDC) that substantially improves structural control performance. | | | |
| 2013-25 | | Other | Abusayeed Saifullah, David Ferry, Jing Li, Kunal Agrawal, Chenyang Lu, and Christopher Gill | 4/23/2013 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1013/dag_rep.pdf | Recently, multi-core processors have become mainstream in processor design. To take full advantage of multi-core processing, computation-intensive real-time systems must exploit intra-task parallelism. In this paper, we address the open problem of real-time scheduling for a general model of deterministic parallel tasks, where each task is represented as a directed acyclic graph (DAG) with nodes having arbitrary execution requirements. We prove processor-speed augmentation bounds for both preemptive and non-preemptive real-time scheduling for general DAG tasks on multi-core processors. We first decompose each DAG into sequential tasks with their own release times and deadlines. Then we prove that these decomposed tasks can be scheduled using preemptive global EDF with a resource augmentation bound of 4. This bound is as good as the best known bound for more restrictive models, and is the first for a general DAG model. We also prove that the decomposition has a resource augmentation bound of 4 plus a non-preemption overhead for non-preemptive global EDF scheduling. To our knowledge, this is the first resource augmentation bound for non-preemptive scheduling of parallel tasks. Finally, we evaluate our analytical results through simulations that demonstrate that the derived bounds are safe, and reasonably tight in practice, especially under preemptive EDF scheduling. | | | |
| 2013-20 | | PhD Dissertation | Minmin Chen | 4/9/2013 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1012/ChenDissertation.pdf | The generalization properties of most existing machine learning techniques are predicated on the assumptions that 1) a sufficiently large quantity of training data is available; 2) the training and testing data come from some common distribution. Although these assumptions are often met in practice, there are also many scenarios in which training data from the relevant distribution is insufficient. We focus on making use of additional data, which is readily available or can be obtained easily but comes from a different distribution than the testing data, to aid learning. We present five learning scenarios, depending on how the distribution we used to sample the additional training data differs from the testing distribution: 1) learning with weak supervision; 2) domain adaptation; 3) learning from multiple domains; 4) learning from corrupted data; 5) learning with partial supervision. We introduce two strategies and manifest them in five ways to cope with the difference between the training and testing distribution. The first strategy, which gives rise to Pseudo Multi-view Co-training (PMC) and Co-training for Domain Adaptation (CODA), is inspired by the co-training [23] algorithm for multi-view data. PMC generalizes co-training to the more common single view data and allows us to learn from weakly labeled data retrieved free from the web. CODA integrates PMC with an another feature selection component to address the feature incompatibility between domains for domain adaptation. PMC and CODA are evaluated on a variety of real datasets, and both yield record performance. The second strategy marginalized dropout leads to marginalized Stacked Denoising Autoencoders (mSDA), Marginalized Corrupted Features (MCF) and FastTag (FastTag). mSDA diminishes the difference between distributions associated with different domains by learning a new representation through marginalized corruption and reconstruciton. MCF learns from a known distribution which is created by corrupting a small set of training data, and improves robustness of learned classifiers by training on "infinitely" many data sampled from the distribution. FastTag applies marginalized dropout to the output of partially labeled data to recover missing labels for multi-label tasks. These three algorithms not only achieve the state-of-art performance in various tasks, but also deliver orders of magnitude speed up at training and testing comparing to competing algorithms.
| | | |
| 2013-18 | | PhD Dissertation | Shakir James | 12/28/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1011/JamesDissertation.pdf | Data distribution applications transfer data les on computer networks. On these networks, peer-to-peer applications transfer bulk data quickly to many destinations. But existing applications increase network costs on the Internet and underuse network capacity on the data center network. To address these network ineciencies, this thesis introduces two applications. First, it proposes an Internet application that decreases the costly trac of peer-to-peer applications while maintaining their download times. This approach is novel because it works without changing existing peer-to-peer applications on the Internet. Second, after identifying the problem with existing applications in data centers, this thesis proposes a peer-to-peer application that increases its usage of the network capacity and maintains the speed of other applications. This approach is novel because it is more resilient than similar peer-to-peer applications on the data center network.
| | | |
| 2013-14 | | Other | William D. Richard, Ph.D. | 3/7/2013 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1010/Efficient Parallel Upsampling of Ultrasound Vectors_Richard.pdf | Upsampling is required prior to the summation step in most receive digital beamforming implementations to produce an accurate summed RF line or vector. This is true in both annular and linear array systems where receive echos are digitized first and then time delayed in the digital domain to achieve proper signal alignment. The efficient, parallel, real-time upsampling circuit presented here produces M upsampled values per ADC clock, where M is the desired upsampling factor. A circuit implementation that upsamples by a factor of M=4 is presented as an example of the more general technique. | wdr@wustl.edu | | |
| 2013-10 | | Other | Peng Li, Kunal Agrawal, Jeremy Buhler, Roger D. Chamberlain | 2/20/2013 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1009/add_data_parallel.pdf | The streaming model is a popular model for writing high-throughput parallel applications. A streaming application is represented by a graph of computation stages that communicate with each other via FIFO channels. In this report, we consider the problem of mapping streaming pipelines — streaming applications where the graph is a linear chain — in order to maximize throughput. In a parallel setting, subsets of stages, called components can be mapped onto different computing resources. The through-put of an application is determined by the throughput of the slowest component. Therefore, if some stage is much slower than others, then it may be useful to replicate the stage’s code and divide its workload among two or more replicas in order to increase throughput. However, pipelines may consist of some replicable and some non-replicable stages. In this paper, we address the problem of mapping these partially replicable streaming pipelines on both homogeneous and heterogeneous platforms so as to maximize throughput. We consider two types of platforms, homogeneous platforms — where all resources are identical, and heterogeneous platforms — where resources may have different speeds. In both cases, we consider two network topologies — unidirectional chain and clique. We provide polynomial-time algorithms for mapping partially replicable pipelines onto unidirectional chains for both homogeneous and heterogeneous platforms. For homogeneous platforms, the algorithm for unidirectional chains generalizes to clique topologies. However, for heterogeneous platforms, mapping these pipelines onto clique topologies is NP-complete. We provide heuristics to generate solutions for cliques by applying our chain algorithms to a series of chains sampled from the clique. Our empirical results show that these heuristics rapidly converge to near-optimal solutions. | | | |
| 2013-9 | | Other | Jing Li, Kunal Agrawal, Chenyang Lu, Christopher Gill Department of Computer Science and Engineering Washington University in St. Louis St. Louis, MO, USA | 2/16/2013 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1008/ECRTS_dagedf_li.pdf | As multicore processors become ever more prevalent, it is important for real-time programs to take advantage of intra-task parallelism in order to support computation-intensive applications with tight deadlines. We prove that a Global Earliest Deadline First (GEDF) scheduling policy provides a capacity augmentation bound of 4-2/m and a resource augmentation bound of 2-1/m for parallel tasks in the general directed acyclic graph model. For the proposed capacity augmentation bound of 4-2/m for implicit deadline tasks under GEDF, we prove that if a task set has a total utilization of at most m/(4-2/m) and each task’s critical path length is no more than 1/(4-2/m) of its deadline, it can be scheduled on a machine with m processors under GEDF. Our capacity augmentation bound therefore can be used as a straightforward schedulability test. For the standard resource augmentation bound of 2-1/m for arbitrary deadline tasks under GEDF, we prove that if an ideal optimal scheduler can schedule a task set on m unit-speed processors, then GEDF can schedule the same task set on m processors of speed 2-1/m. However, this bound does not lead to a schedulabilty test since the ideal optimal scheduler is only hypothetical and is not known. Simulations confirm that the GEDF is not only safe under the capacity augmentation bound for various randomly generated task sets, but also performs surprisingly well and usually outperforms an existing scheduling technique that involves task decomposition. | li.jing@wustl.edu | | Real-Time Scheduling of Parallel Tasks: Theory and Practice at Washington University in St. Louis |
| 2013-2 | | Other | Jonathan C. Beard, Roger D. Chamberlain and Mark A. Franklin | 2/11/2013 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1007/simplemodelstechreport.pdf | Modern hardware is inherently heterogeneous. With heterogeneity comes multiple abstraction layers that hide underlying complex systems. While hidden, this complexity makes quantitative performance modeling a difficult task. Designers of high-performance streaming applications for heterogeneous systems must contend with unpredictable and often non-generalizable models to predict performance of a particular application and hardware mapping. This paper outlines a computationally simple approach that can be used to model the overall throughput and buffering needs of a streaming application on heterogeneous hardware. The model presented is based upon a hybrid maximum flow and decomposed discrete queueing model. The utility of the model is assessed using a set of real and synthetic benchmarks with model predictions compared to measured application performance. | jbeard@wustl.edu | | |
| 2013-1 | | MS Project Report | Xiang Zhou | 1/8/2013 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1006/Project_XiangZhou.pdf | Recent advances in High-Throughput Sequencing (HTS) technology have greatly facilitated the researches in bioinformatics field. With the ultra-high sequencing speed and improved base-calling accuracy, Illumina Genome Analyzer is currently the most widely used platform in the field. To use the raw reads generated from the sequencing machine, the 3’ adapter sequence attached to the real read in the process of ligation needs to be correctly trimmed. This is often done by some inhouse scripts or different packages with various parameters. They either use the Smith-Waterman algorithm or search for an exact match of the 3’ adapter sequence. In this report, I investigated methodologies as well as the strengths and weaknesses of five representative mainstream adapter trimming tools in order to suggest a direction for other researchers. Furthermore, four sets of detailed analysis were performed to evaluate the performances of these tools. I demonstrated that my adapter trimming method is flexible, accurate and efficient for Next Generation Sequencing (NGS) analysis. | | | |
| 2012-85 | | MS Project Report | Donald McCurdy | 12/27/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1005/McCurdy_project.pdf | In this project, I present an application to view, interact with, and search 3D medical volumes as part of the GeneAtlas project. | | | |
| 2012-84 | | MS Project Report | Collin Foster | 12/20/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1004/Collin_Project.pdf | Recent breakthroughs in nanofabrication techniques have led to development of sophisticated Division-of-Focal-Plane (DoFP) polarization imaging sensors. One such technique allows the fabrication of nanowire filters fabricated directly on the imaging sensor itself. This technique can be used to fabricate robust DoFP polarization imaging sensors. However, the polarization information captured by the imagers can be degraded due to imperfections in the fabrication of the nanowire filters on the imaging sensor. Polarization information can also be degraded from other sources including crosstalk between pixels. To compensate for these undesired effects, a calibration routine can be applied to each pixel after image capture. In this project, we implement a calibration routine as a step in the pipeline processing of captured polarization images. The polarization image processing and calibration are implemented on an FPGA development board to achieve a real-time response to image captures. | | | |
| 2012-83 | | MS Project Report | Melynda Eden | 12/11/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1003/MelyndaEden.pdf | Cerebral palsy is a group of neurological disorders that impair body movement, muscle coordination, hearing, vision, and cognitive function. Symptoms vary but can include muscle weakness, muscle and joint tightness, abnormal or unsteady gait, seizures, learning disabilities, speech problems, as well as hearing or vision problems [1]. Although cerebral palsy cannot be cured, treatments such as physical and occupational therapy can greatly help affected children develop motor skills needed to increase mobility and foster independence [2]. Computer based therapy games have shown promise in helping stroke survivors recover from stroke [3]. Initially, stroke therapy games developed in Looking Glass utilized Nintendo Wii remotes (informally known as Wiimotes) to sense user’s movements. Challenges unfolded with stroke patients who were unable to grasp Wiimotes, thereby limiting and inhibiting game development and the user experience [3]. In this paper, I describe my efforts to integrate the Microsoft Kinect with Looking Glass and build therapy games that utilize the Kinect to track user movements. I detail the Kinect integration and discuss its advantages of seated skeletal tracking with no hand held devices required by the user. | | | |
| 2012-82 | | MS Project Report | David Shelley | 12/7/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1002/ShelleyDavid.pdf | Most high-fidelity digital cameras currently available obtain their images using technology where individual pixels can only acquire a single color. Since the acquisition of multiple colors is necessary to capture a full colored image, picture resolution is lost due to difficulties in interpolation between non-adjacent, same color pixels. The resulting unsharp images create the need to find a new way to obtain these images without requiring interpolation between adjacent pixels. An innovative method to capture images with individual pixel cells consisting of three layers of photodetectors stacked vertically upon one another has been created to rectify this problem, but no complete camera system has been fabricated to date. This project integrates a stacked photo detector imager chip into a complete camera, allowing full functionality and control of the photo detector chip to a personal computer. A six layer printed circuit board is fabricated using Cadence Allegro, firmware is written and compiled using Xilinx ISE and software is written in Microsoft Visual Studio 2010.
| | | |
| 2012-76 | | MS Project Report | Mahesh Mahadevan | 11/27/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1001/Mahadevan_project.pdf | Instrumentation and log data monitoring are important features present in many different applications today that are used in analysis of a system behaviour. They are largely used in collection of key information from the system which might allow us to describe or establish certain facts about the behavior of the system. Instrumentation in any application usually comes along with its overheads. In some cases , code for instrumentation might become more expensive on the underlying resources than the application itself. For many applications, especially real-time applications , this overhead can cause serious and often catastrophic system failures. Collecting useful information from a real-time application has to be without interfering with the performance of the application. In this paper , I describe a mechanism of collection and visualization of performance data from a real-time concurrency platform in linux , without interfering and ensuring the correctness of running real-time parallel tests and experiments. This paper describes two different forms of data collection 1) Collection of timing related information from various critical phases in a real-time experiment , 2) Collection of cpu preemption information on running experiment threads. Data collected from both these stages are further fed to a visualization tool for post-debugging analysis of real-time experiment. | | | |
| 2012-81 | | Other | Sisu Xi, Chong Li, Chenyang Lu, and Christopher Gill | 10/15/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/1000/rtxenio.pdf | As computer hardware becomes increasingly powerful, there is an ongoing trend towards integrating complex, legacy real-time systems using fewer hosts through virtualization. Especially in embedded systems domains such as avionics and automotive engineering, this kind of system integration can greatly reduce system weight, cost, and power requirements. When systems are integrated in this manner, \emph{network communication} may become local \emph{inter-domain communication} (IDC) within the same host. This paper examines the limitations of inter-domain communication in Xen, a widely used open-source virtual machine monitor (VMM) that recently has been extended to support real-time domain scheduling. We find that both the VMM scheduler and the manager domain can significantly impact real-time IDC performance under different conditions, and show that improving the VMM scheduler alone cannot deliver real-time performance for local IDC. To address those limitations, we present the RTCA, a Real-Time Communication Architecture within the manager domain in Xen, along with empirical evaluations whose results demonstrate that the latency of communication tasks can be improved dramatically from ms to $\mu$s by a combination of the RTCA and a real-time VMM scheduler. | xisisu@gmail.com | | |
| 2012-64 | | Other | Lin Ma, Kunal Agrawal, Roger D. Chamberlain | 10/1/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/999/TechnicalReport_ma.pdf | Many-core architectures are excellent in hiding memory-access latency by low-overhead context switching among a large number of threads. The speedup of algorithms carried out on these machines depends on how well the latency is hidden. If the number of threads were infinite, then theoretically these machines should provide the performance predicted by the PRAM analysis of the programs. However, the number of allowable threads per processor is not infinite. In this paper, we introduce the Threaded Many-core Memory (TMM) model which is meant to capture the important characteristics of these highly-threaded, many-core machines. Since we model some important machine parameters of these machines, we expect analysis under this model to give more fine-grained performance prediction than the PRAM analysis. We analyze 4 algorithms for the classic all pairs shortest paths problem under this model. We find that even when two algorithms have the same PRAM performance, our model predicts different performance for some settings of machine parameters. For example, for dense graphs, the Floyd-Warshall algorithm and Johnson’s algorithms have the same performance in the PRAM model. However, our model predicts different performance for large enough memory-access latency and validates the intuition that the Floyd-Warshall algorithm performs better on these machines. | lin.ma@cse.wustl.edu | Work submitted and accepted by ICPADS'2012 | http://www1.cse.wustl.edu/~lin.ma/ |
| 2012-62 | | Other | Abusayeed Saifullah | 9/21/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/998/Saifullah.pdf | Self-stabilization is a theoretical framework of non-masking fault-tolerance for distributed networks. A self-stabilizing system is capable of tolerating any unexpected transient fault without outside intervention and, regardless of the initial state, it can converge to a legitimate global state, a predefined vector of local states, in finite time. Self-stabilization has rendered a good problem solving paradigm of networks over the last decade. In this paper, we survey the self-stabilizing solutions for various network optimization problems such as network flow, load balancing, load and resource distribution, routing, file distribution, shortest paths etc. The paper also summarizes some recent works presenting how the convergence of a self-stabilizing distributed network can be modelled as a convex optimization problem with the exploitation of an analogy between self-stabilizing systems and stable feedback systems. The works pertaining to gradient adaptation of self-stabilizing system are also presented. | saifullaha@cse.wustl.edu | | http://www.cse.wustl.edu/~saifullaha/ |
| 2012-56 | | Other | Abusayeed Saifullah | 9/21/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/997/Self_stabilization.pdf | The notion of self-stabilization was first proposed by Dijkstra in 1974 in his classic paper. The paper defines a system as self-stabilizing if, starting at any, possibly illegitimate, state the system can automatically adjust itself to eventually converge to a legitimate state in finite amount of time and once in a legitimate state it will remain so unless it incurs a subsequent transient fault. Dijkstra limited his attention to a ring of finite-state machines and provided its solution for self-stabilization. In the years following his introduction, very few papers were published in this area. Once his proposal was recognized as a milestone in work on fault tolerance, the notion propagated among the researchers rapidly and many researchers in the distributed systems diverted their attention to it. The investigation and use of self-stabilization as an approach to fault-tolerant behavior under a model of transient failures for distributed systems is now undergoing a renaissance. A good number of works pertaining to self-stabilization in the distributed systems were proposed in the yesteryears most of which are very recent. This report surveys all previous works available in the literature of self-stabilizing systems. | saifullaha@cse.wustl.edu | | http://www.cse.wustl.edu/~saifullaha/ |
| 2012-55 | | PhD Dissertation | Xuefeng Zhou | 9/10/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/996/zhou-dissertation.pdf | Post-transcriptional gene regulation at the RNA level has been recently shown to be more widespread and important than previously assumed. While various regulatory RNA molecules have been reported in animals and plants, two prominent types of regulatory small RNAs are microRNAs (miRNAs) and endogenous short interfering RNAs (siRNAs). Because of their importance, their nature and the difficulties in studying them, research in miRNAs has been an active research topic with many computational challenges. First, computational strategies for miRNA identification have been developed to overcome the technical hurdles for experimental methods based on expression screening. We propose and develop a novel ranking algorithm based on random walks to computationally predict novel miRNAs from genomes, which have a few known miRNAs, may be poorly annotation and even not completely assembled. We also develop meta-feature based classification method to identify miRNAs from high-throughput sequencing data of small RNAs. In addition, we devise a pipeline to analyze natsiRNA in high-throughput sequencing data. Secondly, we formulate the problem of promoter prediction based on multiple instance learning scheme, and propose an effective promoter identification algorithm, called CoVote. We apply CoVote to predict microRNA core promoters. We investigate core promoter regions of microRNA genes in Caenorhabditis elegans, Homo sapiens, Arabidopsis thaliana and Oryza sativa, and further analyze sequence motifs in the putative core promoters which may be involved in the transcription of microRNA genes. Furthermore, with characterized promoters of miRNA genes, we apply data mining approaches to model the transcriptome of miRNAs under particular conditions. Additionally, by integrating miRNA target genes, we further analyze the miRNA-mediate regulatory networks and computationally identify network motifs. Since miRNAs and their targets can be formulated as a natural bipartite network(graph), we propose and develop a tool to study modules in the miRNA-regulatory network. | | | |
| 2012-54 | | PhD Dissertation | Zachary Freudenburg | 9/7/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/995/Freudenburg.pdf | Brain Computer Interfaces (BCI) systems which allow humans to control external devices directly from brain activity, are becoming increasingly popular due to dramatic advances in the ability to both capture and interpret brain signals. Further advancing BCI systems is a compelling goal both because of the neurophysiology insights gained from deriving a control signal from brain activity and because of the potential for direct brain control of external devices in applications such as brain injury recovery, human prosthetics, and robotics. The dynamic and adaptive nature of the brain makes it difficult to create classifiers or control systems that will remain effective over time. However it is precisely these qualities that offer the potential to use feedback to build on simple features and create complex control features that are robust over time. This dissertation presents work that addresses these opportunities for the specific case of Electrocorticography (ECoG) recordings from clinical epilepsy patients. First, queued patient tasks were used to explore the predictive nature of both local and global features of the ECoG signal. Second, an algorithm was developed and tested for estimating the most informative features from naive observations of ECoG signal. Third, a software system was built and tested that facilitates real-time visualizations of ECoG signal patients and allows ECoG epilepsy patients to engage in an interactive BCI control feature screening process. | | | |
| 2012-50 | | PhD Dissertation | Ross Sowell | 8/31/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/994/Sowell.pdf | Magnetic resonance imaging (MRI) and computed tomography (CT) scanners have long been used to produce three-dimensional (3D) samplings of anatomy elements for use in medical visualization and analysis. From such data sets, physicians often need to construct surfaces representing anatomical shapes in order to conduct treatment, such as irradiating a tumor. Traditionally, this is done through a time-consuming and error-prone process in which an experienced scientist or physician marks a series of parallel contours that outline the structures of interest. Recent advances in surface reconstruction algorithms have led to methods for reconstructing surfaces from nonparallel contours that could greatly reduce the manual component of this process. Despite these technological advances, the segmentation process has remained unchanged. This dissertation takes the first steps toward bridging the gap between the new surface reconstruction technologies and bringing those methods to use in clinical practice. We develop VolumeViewer [68], a novel interface for modeling surfaces from volume data by allowing the user to sketch contours on arbitrarily oriented cross-sections of the volume. We design the algorithms necessary to support nonparallel contouring, and we evaluate the system with medical professionals using actual patient data. In this way, we begin to understand how nonparallel contouring can aid the segmentation process and expose the challenges associated with a nonparallel contouring system in practice. | | | |
| 2012-45 | | Other | Ming Zou, Tao Ju, and Nathan Carr | 7/2/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/993/pringleSGP.pdf | Triangulation of 3D polygons is a well studied topic of research. Existing methods for finding triangulations that minimize given metrics (e.g., sum of triangle areas or dihedral angles) run in a costly $O(n^4)$ time \cite{Barequet95,Barequet96}, while the triangulations are not guaranteed to be free of intersections. To address these limitations, we restrict our search to the space of triangles in the Delaunay tetrahedralization of the polygon. The restriction allows us to reduce the running time down to $O(n^2)$ in practice ($O(n^3)$ worst case) while guaranteeing that the solutions are intersection free. We demonstrate experimentally that the reduced search space is not overly restricted. In particular, triangulations restricted to this space usually exist for practical inputs, and the optimal triangulation in this space approximates well the optimal triangulation of the polygon. This makes our algorithms a practical solution when working with real world data. | taoju@cse.wustl.edu | Work submitted to but not accepted by SGP'12. | |
| 2012-44 | | PhD Dissertation | Huang-Ming Huang | 6/26/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/992/thesis-main_Huang.pdf | Traditional fixed-priority scheduling analysis for periodic/sporadic task sets is based on the assumption that all tasks are equally critical to the correct operation of the system. Therefore, every task has to be schedulable under the scheduling policy, and estimates of tasks’ worst case execution times must be conservative in case a task runs longer than is usual. To address the significant under-utilization of a system’s resources under normal operating conditions that can arise from these assumptions, several mixed-criticality scheduling approaches have been proposed. However, to date there has been no quantitative comparison of system schedulability or run-time overhead for the different approaches. In this dissertation, we present what is to our knowledge the first side-by-side implementation and evaluation of those approaches, for periodic and sporadic mixed-criticality tasks on uniprocessor or distributed systems, under a mixed-criticality scheduling model that is common to all these approaches. To make a fair evaluation of mixed-criticality scheduling, we also address some previously open issues and propose modifications to improve schedulability and correctness of particular approaches. To facilitate the development and evaluation of mixed-criticality applications, we have designed and developed a distributed real-time middleware, called MCFlow, for mixed-criticality end-to-end tasks running on multi-core platforms. The research presented in this dissertation provides the following contributions to the state of the art in real-time middleware: (1) an efficient component model through which dependent subtask graphs can be configured flexibly for execution within a single core, across cores of a common host, or spanning multiple hosts; (2) support for optimizations to inter-component communication to reduce data copying without sacrificing the ability to execute subtasks in parallel; (3) a strict separation of timing and functional concerns so that they can be configured independently; (4) an event dispatching architecture that uses lock free algorithms where possible to reduce memory contention, CPU context switching, and priority inversion; and (5) empirical evaluations of MCFlow itself and of different mixed criticality scheduling approaches both with a single host and end-to-end across multiple hosts. The results of our evaluation show that in terms of basic distributed real-time behavior MCFlow performs comparably to the state of the art TAO real-time object request broker when only one core is used and outperforms TAO when multiple cores are involved. We also identify and categorize different use cases under which different mixed criticality scheduling approaches are preferable.
| | | |
| 2012-42 | | Other | Michelle Vaughan, Cindy Grimm, Ruth West, Ross Sowell, Robert Pless and Stephen Kobourov | 6/14/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/991/MG_CSSeg-sub.pdf | Segmentation of 3D and time-varying volumetric (4D) image data is considered a time and resource intensive bottleneck in scientific endeavors. Automatic methods are becoming more reliable, but many data sets still require manual intervention. This can mainly be attributed to the characteristics of the image data not being amenable to automated methods, the existence of variations in or poor image quality, or the need for an expert to review and edit results from an automatic technique. Manually segmenting volumetric data is a challenge even for those more experienced. Understanding the 3D nature of the data and navigating through the 3D environment poses some of the main difficulties of the task. Understanding what it means to segment data, and what a contour and a set of contours portrays is key to producing meaningful and usable results. Our goal is to construct a system that will allow even novices to segment 3D data and produce results that experts and scientists can review and use. Through the use of fe tures such as tutorials, contouring protocols, navigation aids, and other interface enhancements, we believe that we can create a workflow that will allow any level of user to work with and understand 3D data. We propose a guided segmentation system that (1) lets experts create data specific segmentation aids, (2) helps users create a meaningful segmentation using these aids, and (3) allows the experts to use the results to provide helpful information to scientists. | | | |
| 2012-41 | | MS Project Report | Devorah Langsam | 5/8/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/990/DevorahLangsam.pdf | Capturing imagery from outdoor cameras provides a large amount of information about a scene. The true surface appearances of elements in a scene, however, are often incorrectly represented in images. To get a better representation of the scene it is necessary to separate the effects of the underlying reflectance, illumination, and fog in the image. The goal of the dark channel prior is to eliminate the effects of haze in outdoor images and recover the true surface reflectance image for the scene.
| | | |
| 2012-40 | | MS Project Report | Roger Alessi | 5/8/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/989/Roger Alessi.pdf | Circuit boards with Field Programmable Gate Arrays (FPGAs) have a historically diverse set of standards for communicating with other devices. This provides a challenge for developers creating FPGA applications and makes migration of applications from one board or FPGA to another difficult. Many board manufacturers, including Xilinx and GiDEL, create boards with Peripheral Component Interconnect Express (PCIe) buses. PCIe provides a low-level standard for transferring data between a Central Processing Unit (CPU) and an FPGA, and these manufacturers have designed Direct Memory Access (DMA) engines to implement the standard. However, each manufacturer’s DMA engines do not share a standard interface, leading to the same difficulties switching between boards or FPGAs. The goal of this project is to reach a common interface for the diverse set of available DMA engines to improve application development productivity, particularly for the Autopipe environment, an environment for developing multi-device streaming applications.
| | | |
| 2012-37 | | MS Project Report | Jed Jackoway | 5/1/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/988/Jackoway_Project.pdf | The goal of the project was to reconstruct the skeleton of a Microsoft Kinect user’s hand. Out of the box, Kinect reconstruct the skeleton of users’ bodies, but it only does large joints, such that the hand is given a location on the general skeleton, but the specifics of the fingers and fist are not actually calculated.
| | | |
| 2012-36 | | MS Project Report | Alex Drake | 5/1/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/987/Kinect Hand Recognition and Tracking.pdf | The goal of this research project was to be able to identify and track a hand using the depth image from a Microsoft Kinect. The ability to do this would have uses in sign language recognition, rehabilitation, and gesture recognition amongst others. The Microsoft Kinect is currently capable of identifying and tracking whole bodies. Our approach was to follow the method that Microsoft used but apply it to detailed hand recognition instead of the body. The basic strategy they used was to take a large amount of labeled depth data and build a decision tree to identify which part of the body each pixel of an image belongs to.
| | | |
| 2012-35 | | PhD Dissertation | Brian Haynes | 5/1/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/986/BrianHaynes.pdf | This dissertation addresses a current outstanding problem in the field of systems biology, which is to identify the structure of a transcriptional network from high-throughput experimental data. Understanding of the connectivity of a transcriptional network is an important piece of the puzzle, which relates the genotype of an organism to its phenotypes. An overwhelming number of computational approaches have been proposed to perform integrative analyses on large collections of high-throughput gene expression datasets to infer the structure of transcriptional networks. I put forth a methodology by which these tools can be evaluated and compared against one another to better understand their strengths and weaknesses. Next I undertake the task of utilizing high-throughput datasets to learn new and interesting network biology in the pathogenic fungus Cryptococcus neoformans. Finally I propose a novel computational method for mapping out transcriptional networks that unifies two orthogonal strategies for network inference. I apply this method to map out the transcriptional network of Saccharomyces cerevisiae and demonstrate how network inference results can complement chromatin immunoprecipitation (ChIP) experiments, which directly probe the binding events of transcriptional regulators. Collectively, my contributions improve both the accessibility and practicality of network inference methods.
| | | |
| 2012-31 | | MS Project Report | Garrison Prinslow | 4/30/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/985/GarrisonPrinslow_MEngCSE_MastersProject.pdf | This project was motivated by two related ideas: what can be learned about the unique issues involved with developing cloud-based mobile applications; and, what application could be developed to evaluate these characteristics that would also be innovative and provide value to users. After vetting ideas for the latter objective, it was clear that a new digital coupon system could offer value and apply interesting technologies to computer science problems, such as shortest path calculation, position-aware authentication, game theory, and statistical reasoning. A brief summary of the motivation for developing the new digital coupon system follows.
| | | |
| 2012-28 | | MS Project Report | Zhicheng Yang | 4/27/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/984/Yang_MastersProject.pdf | In this project, we continued Dr. Chipara's study, which developed an early warning system (EWS) to detect the vital signs of patients in order to help doctors to intervene in time [1]. Since the number of wards increased, the environment our system faced with became more complicated and our network became more sensitive. This project focused on finding reasons on the relays that didn't work and doing a reliability analysis on the network in one ward our study covered.
| | | |
| 2012-24 | | Other | Abusayeed Saifullah, Kunal Agrawal, Chenyang Lu, Christopher Gill | 4/9/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/983/Notes-new-rtsj.pdf | This paper proposes some significant corrections in a recent work of Lakshmanan et al on parallel task scheduling. Lakshmanan et al have proposed a transformation of parallel tasks into sequential tasks, and have claimed a resource augmentation bound of 3:42 for partitioned deadline monotonic (DM) scheduling of the transformed tasks. We demonstrate that their analysis for resource augmentation bound is incorrect. We propose a different technique for task transformation that requires a resource augmentation bound of 5 for partitioned DM scheduling. | | | |
| 2012-19 | | PhD Dissertation | Benjamin Wun | 3/27/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/981/Wun_thesis.pdf | High speed networking is a demanding task that has traditionally been performed in dedicated, purpose built hardware or specialized network processors. These platforms sacrifice flexibility or programmability in favor of performance. Recently, there has been much interest in using multi-core general purpose processors, which have the advantages of being easily programmable and upgradeable. The best way to exploit these new architectures for networking is an open question that has been the subject of much recent research. In this disseration, I explore the best way to exploit multicore general purpose processors for packet processing applications. This includes both new architectural organizations for the processors as well as changes to the systems software. I intend to demonstrate the efficacy of these techniques by using them to build an open and extensible network security and monitoring platform that can out perform existing solutions.
| | | |
| 2012-18 | | PhD Dissertation | Tom Erez | 3/27/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/980/Erez_thesis.pdf | This dissertation presents algorithms that allow robots to generate optimal behavior from first principles. Instead of hard-coding every desired behavior, we encode the task as a cost function, and use numerical optimization to find action sequences that can accomplish the task. Using the theoretical framework of optimal control, we develop methods for generating autonomous motor behavior in high-dimensional domains of legged locomotion. We identify three foundational problems that limit the application of existing optimal control algorithms, and present guiding principles that address these issues. First, some traditional algorithms use global optimization, where every possible state is considered. This approach cannot be applied in continuous domains, where every additional mechanical degree of freedom exponentially increases the volume of state space. In order to sidestep this curse of dimensionality, we focus on trajectory optimization, which finds locally-optimal solutions while scaling only polynoimally with state dimensionality. Second, many algorithms of optimal control and reinforcement learning cannot be directly applied to continuous domains with contacts, due to the non-smooth dynamics. We present techniques of contact smoothing that enable the use of standard continuous optimization tools. Finally, domains of legged locomotion give rise to extremely non-convex optimization, which is riddled with local minima. This issue is addressed by using a shaping protocol (homotopy-continuation), whereby the solution of an easier variant of the problem is used as initial guess for a harder variant of the problem, gradually finding a solution to the original domain. In dynamic environments, effective behavior requires feedback control, which provides appropriate reaction to changing circumstance. We present tools that generate adaptive autonomous behavior through local optimization, and scale well to domains with over 60 state-space dimensions. We present a series of experiments that demonstrate robust and reactive motor behavior in domains of increasing complexity. First, a multi-joint swimmer which can follow a moving target, withstand violent perturbations, and avoid moving obstacles. Second, a one-legged hopper that can maintain its gait through both perturbations and morphological alteration. And finally, two bipeds: a planar walker, and a 3D runner that uses its arms to balance its gait.
| | | |
| 2012-17 | | PhD Dissertation | Terry Tidwell | 3/27/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/979/tidwellThesis.pdf | Time utility functions offer a reasonably general way to describe the complex timing constraints of real-time and cyber-physical systems. However, utility-aware scheduling policy design is an open research problem. In particular, scheduling policies that optimize expected utility accrual are needed for real-time and cyber-physical domains. This dissertation addresses the problem of utility-aware scheduling for systems with periodic real-time task sets and stochastic non-preemptive execution intervals. We model these systems asMarkov Decision Processes. This model provides an evaluation framework by which different scheduling policies can be compared. By solving the Markov Decision Process we can derive value-optimal scheduling policies for moderate sized problems. However, the time and memory complexity of computing and storing value-optimal scheduling policies also necessitates the exploration of other more scalable solutions. We consider heuristic schedulers, including a generalization we have developed for the existing Utility Accrual Packet Scheduling Algorithm. We compare several heuristics under soft and hard real-time conditions, different load conditions, and different classes of time utility functions. Based on these evaluations we present guidelines for which heuristics are best suited to particular scheduling criteria. Finally, we address the memory complexity of value-optimal scheduling, and examine trade-offs between optimality and memory complexity. We show that it is possible to derive good low complexity scheduling decision functions based on a synthesis of heuristics and reduced-memory approximations of the value-optimal scheduling policy. | | | |
| 2012-16 | | Other | Abusayeed Saifullah, Paras Babu Tiwari, Bo Li, Chenyang Lu, Yixin Chen | 2/22/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/978/reliable_scheduling_1.pdf | WirelessHART networks are gaining ground as a real-time communication infrastructure in industrial wireless control systems. Because wireless communication is often susceptible to transmission failures in industrial environments, it is essential to account for failures in the delay analysis for realtime flows between sensors and actuators in process control. WirelessHART networks handle transmission failures through retransmissions using dedicated and shared time slots through different paths in the routing graphs. While these mechanisms for handling transmission failures are critical for process control requiring reliable communication, they introduce substantial challenges to worst-case end-to-end delay analysis for real-time flows. This paper presents the first worst-case end-to-end delay analysis for periodic real-time flows in a WirelessHART network that takes into account transmission failures. The delay bounds can be used to quickly assess the schedulability of real-time flows for industrial wireless control applications with stringent requirements on both high reliability and network latency. Simulations based on the topologies of a wireless sensor network testbed consisting of 69 TelosB motes indicate that our analysis provides safe upper bounds of the end-to-end delays of real-time flows at an acceptable level of pessimism. | saifullah@wustl.edu | | http://www.cse.wustl.edu/~saifullaha/ |
| 2012-14 | | Other | Abusayeed Saifullah, David Ferry, Kunal Agrawal, Chenyang Lu, and Christopher Gill | 2/22/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/977/dag_parallel_1.pdf | Due to their potential to deliver increased performance over single-core processors, multi-core processors have become mainstream in processor design. Computation-intensive real-time systems must exploit intra-task parallelism to take full advantage of multi-core processing. However, existing results in real-time scheduling of parallel tasks focus on restrictive task models such as the synchronous model where a task is a sequence of alternating parallel and sequential segments, and parallel segments have threads of execution that are of equal length. In this paper, we address a general model for deterministic parallel tasks, where a task is represented as a DAG with different nodes having different execution requirements. We make several key contributions towards both preemptive and non-preemptive realtime scheduling of DAG tasks on multi-core processors. First, we propose a task decomposition that splits a DAG into sequential tasks. Second, we prove that parallel tasks, upon decomposition, can be scheduled using preemptive global EDF with a resource augmentation bound of 4. This bound is as good as the best known bound for more restrictive models, and is the first for a general DAG model. Third, we prove that the decomposition has a resource augmentation bound of 4 plus a non-preemption overhead for non-preemptive global EDF scheduling. To our knowledge, this is the first resource augmentation bound for nonpreemptive scheduling of parallel tasks. Through simulations, we demonstrate that the achieved bounds are safe and sufficient. | saifullah@wustl.edu | | http://www.cse.wustl.edu/~saifullaha/ |
| 2012-6 | | Other | Yi Mao, Wenlin Chen, Yixin Chen, Chenyang Lu, Marin Kollef, Thomas Bailey | 2/21/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/976/sigproc-sp.pdf | Clinical study found that early detection and intervention are essential for preventing clinical deterioration in patients, for patients both in intensive care units (ICU) as well as in general wards but under real-time data sensing (RDS). In this paper, we develop an integrated data mining approach to give early deterioration warnings for patients under real-time monitoring in ICU and RDS. Existing work on mining real-time clinical data often focus on certain single vital sign and specific disease. In this paper, we consider an integrated data mining approach for general sudden deterioration warning. We synthesize a large feature set that includes first and second order time-series features, detrended fluctuation analysis (DFA), spectral analysis, approximative entropy, and cross-signal features. We then systematically apply and evaluate a series of established data mining methods, including forward feature selection, linear and nonlinear classification algorithms, and exploratory undersampling for class imbalance. An extensive empirical study is conducted on real patient data collected between 2001 and 2008 from a variety of ICUs. Results show the benefit of each of the proposed techniques, and the final integrated approach significantly improves the prediction quality. The proposed clinical warning system is currently under integration with the electronic medical record system at Barnes-Jewish Hospital in preparation for a clinical trial. This work represents a promising step toward general early clinical warning which has the potential to significantly improve the quality of patient care in hospitals. | chen@cse.wustl.edu | | |
| 2012-5 | | Other | Cindy Grimm and Pushkar Joshi | 1/9/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/975/JustDrawIt_SBIM2012_sub.pdf | We present a system for sketching in 2D to create 3D curves. The interface is light-weight, pen-based, and based on observations of how artists sketch on paper. | cmg@wustl.edu | | |
| 2012-3 | | Other | Erik Karulf, Marshall Strother, Parker Dunton, and William D. Smart | 1/7/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/974/ride-tr.pdf | There is a growing need for robot control interfaces that allow a single user to effectively control a large number of mostly-autonomous robots. The challenges in controlling such a collection of robots are very similar to the challenges of controlling characters in some genres of video games. In this paper, we argue that interfaces based on elements from computer video games are effective tools for the control of large robot teams. We present RIDE, the Robot Interactive Display Environment, an example of such an interface, and give the results of initial user studies with the interface, which lend support to our claim. | wds@cse.wustl.edu | | |
| 2011-101 | | Other | Logan Stafman | 12/23/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/970/Forest.pdf | Forest is an overlay network designed for large real-time distributed systems. In particular, we’re interested in virtual worlds that provide high-quality interaction over a constantly changing pattern of communication. Forest is suitable for applications in which many entities send data to a large set of constantly changing entities. By using tree-structured channels, we can create logically isolated private networks which support both unicast and multicast routing. In this paper, we will discuss the core components of Forest and measure the performance of the control elements of the network. We will also provide examples of control sequences and the roles played by control elements in those sequences to help maintain fast, reliable data delivery. | lstafman@wustl.edu | | |
| 2011-100 | | MS Project Report | Yan Li | 12/21/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/969/YanLi_project.pdf | N/A | | | |
| 2011-99 | | MS Project Report | Adam Kraft | 12/6/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/968/Kraft_Baseball_AR_1.pdf | As cellular phones grow faster and become equipped with better sensors, consumers are seeing a rise in augmented reality applications available on their mobile devices. Sports augmented reality is one area that is projected to grow over the next couple of years. We aim to build an application intended for use at a live baseball game. Our application optimizes a best-fit homography to locate the structure of the playing field. From there, we are able to visualize information and statistics directly onto the user's mobile device. This is accomplished in real time, requiring minimal input from the user. | | | |
| 2011-96 | | Other | Abusayeed Saifullah | 11/11/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/967/conn_paper.pdf | Self-stabilization for non-masking fault-tolerant distributed system has received considerable research interest over the last decade. In this paper, we propose a self-stabilizing algorithm for 2-edge-connectivity and 2-vertex-connectivity of an asynchronous distributed computer network. It is based on a self-stabilizing depth-first search, and is not a composite algorithm in the sense that it is not composed of a number of self-stabilizing algorithms that run concurrently. The time and space complexities of the algorithm are the same as those of the underlying self-stabilizing depth-first search algorithm. | saifullaha@cse.wustl.edu | | http://www.cse.wustl.edu/~saifullaha/ |
| 2011-95 | | PhD Dissertation | Abdel-Karim Al-Tamimi | 8/1/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/966/abdel.pdf | Video streaming traffic has been surging in the last few years, which has resulted in an increase of its Internet traffic share on a daily basis. The importance of video streaming management has been emphasized with the advent of High Definition (HD) video streaming, as it requires by its nature more network resources. In this dissertation, we provide a better support for managing HD video traffic over both wireless and wired networks through several contributions. We present a simple, general and accurate video source model: Simplified Seasonal ARIMA Model (SAM). SAM is capable of capturing the statistical characteristics of video traces with less than 5% difference from their calculated optimal models. SAM is shown to be capable of modeling video traces encoded with MPEG-4 Part2, MPEG-4 Part10, and Scalable Video Codec (SVC) standards, using various encoding settings. We also provide a large and publicly-available collection of HD video traces along with their analyses results. These analyses include a full statistical analysis of HD videos, in addition to modeling, factor and cluster analyses. These results show that by using SAM, we can achieve up to 50% improvement in video traffic prediction accuracy. In addition, we developed several video tools, including an HD video traffic generator based on our model. Finally, to improve HD video streaming resource management, we present a SAM-based delay-guaranteed dynamic resource allocation (DRA) scheme that can provide up to 32.4% improvement in bandwidth utilization.
| | | |
| 2011-93 | | Other | Abusayeed Saifullah, Chengjie Wu, Paras Babu Tiwari, You Xu, Yong Fu, Chenyang Lu, and Yixin Chen | 10/17/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/965/Control_Optimization1.pdf | With the advent of industrial standards such as WirelessHART, process industries are now gravitating towards wireless control systems. Due to limited bandwidth in a wireless network shared by multiple control loops, it is critical to optimize the overall control performance. In this paper, we address the scheduling-control co-design problem of determining the optimal sampling rates of feedback control loops sharing a WirelessHART network. The objective is to minimize the overall control cost while ensuring that all data flows meet their end-to-end deadlines. The resulting constrained optimization based on existing delay bounds for WirelessHART networks is challenging since it is non-differentiable, non-linear, and not in closed-form. We propose four methods to solve this problem. First, we present a subgradient method for rate selection. Second, we propose a greedy heuristic that usually achieves low control cost while significantly reducing the execution time. Third, we propose a global constrained optimization algorithm using a simulated annealing (SA) based penalty method. Finally, we formulate rate selection as a differentiable convex optimization problem that provides a closed-form solution through a gradient descent method. This is based on a new delay bound that is convex and differentiable, and hence simplifies the optimization problem. We evaluate all methods through simulations based on topologies of a $74$-node wireless sensor network testbed. Surprisingly, the subgradient method is disposed to incur the longest execution time as well as the highest control cost among all methods. SA and the greedy heuristic represent the opposite ends of the tradeoff between control cost and execution time, while the gradient descent method hits the balance between the two. | saifullaha@cse.wustl.edu | | http://www.cse.wustl.edu/~saifullaha/ |
| 2011-89 | | Other | Huang-Ming Huang, Christopher Gill, Chenyang Lu | 10/14/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/964/mc_tech_report2.pdf | Traditional fixed-priority scheduling analysis for periodic task sets is based on the assumption that all tasks are equally critical to the correct operation of the system. Therefore, every task has to be schedulable under the scheduling policy, and estimates of tasks’ worst case execution times must be conservative in case a task runs longer than is usual. To address the significant under-utilization of a system’s resources under normal operating conditions that can arise from these assumptions, three main approaches have been proposed: priority assignment, period transformation, and zero-slack scheduling. However, to date there has been no quantitative comparison of system schedulability or run-time overhead for the different approaches. In this paper, we present what is to our knowledge the first side-by-side evaluation of those approaches, for periodic mixed-criticality tasks on uniprocessor systems, under a mixed-criticality scheduling model that is common to all three approaches. To make a fair evaluation of zero-slack scheduling, we also address two previously open issues: how to accommodate execution of a task after its deadline, and how to account for previously unidentified forms of interference between mixed-criticality tasks. Our simulations show that while priority assignment and period transformation are most likely to be able to schedule a randomly selected task set, a small fraction of the task sets are schedulable only under the zero-slack approach. Our empirical evaluation demonstrates that user-space implementations of mechanisms to enforce period transformation and zero-slack scheduling can be achieved on Linux without kernel modification, with suitably low overhead for mixed-criticality real-time task sets. | | | |
| 2011-88 | | PhD Dissertation | Joseph Lancaster | 10/3/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/963/thesis-Lancaster.pdf | Computer engineers are continually faced with the task of translating improvements in fabrication process technology (i.e., Moore’s Law) into architectures that allow computer scientists to accelerate application performance. As feature-size continues to shrink, architects of commodity processors are designing increasingly more cores on a chip. While additional cores can operate independently with some tasks (e.g. the OS and user tasks), many applications see little to no improvement from adding more processor cores alone. For many applications, heterogeneous systems offer a path toward higher performance. Significant performance and power gains have been realized by combining specialized processors (e.g., Field-Programmable Gate Arrays, Graphics Processing Units) with general purpose multi-core processors. Heterogeneous applications need to be programmed differently than traditional software. One approach, stream processing, fits these systems particularly well because of the segmented memories and explicit expression of parallelism. Unfortunately, debugging and performance tools that support streaming, heterogeneous applications do not exist. This dissertation presents TimeTrial, a performance measurement system that enables performance optimization of streaming applications by profiling the application deployed on a heterogeneous system. TimeTrial performs low-impact measurements by dedicating computing resources to monitoring and by aggressively compressing performance traces into statistical summaries guided by user specification of the performance queries of interest. | | | |
| 2011-86 | | Other | Abusayeed Saifullah, You Xu, Chenyang Lu, and Yixin Chen | 9/29/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/962/NEW_SCHEDULABILITY.pdf | WirelessHART is a new standard specifically designed for real-time and reliable communication between sensor and actuator devices for industrial process monitoring and control applications. End-to-end communication delay analysis for WirelessHART networks is required to determine the schedulability of real-time data flows from sensors to actuators for the purpose of acceptance test or workload adjustment in response to network dynamics. In this paper, we map the scheduling of real-time periodic data flows in a WirelessHART network to real-time multiprocessor scheduling. We then exploit the response time analysis for multiprocessor scheduling and propose a novel method for the delay analysis that establishes an upper bound of the end-to-end communication delay of each real-time flow in a WirelessHART network. Simulation studies based on both random topologies and real network topologies of a 74-node physical wireless sensor network testbed demonstrate that our analysis provides safe and reasonably tight upper bounds of the end-to-end delays of real-time flows, and hence enables effective schedulability tests for WirelessHART networks. | saifullaha@cse.wustl.edu | | http://www.cse.wustl.edu/~saifullaha/ |
| 2011-82 | | PhD Dissertation | Gregory Hackmann | 9/20/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/961/Hackmann.pdf | Wireless Sensor Networks (WSNs) consist of tens or hundreds of small, inexpensive computers equipped with sensors and wireless communication capabilities. Because WSNs can be deployed without xed infrastructure, they promise to enable sensing applications in environments where installing such infrastructure is not feasible. However, the lack of xed infrastructure also presents a key challenge for application developers: sensor nodes must often operate for months or years at a time from fixed or limited energy sources. The focus of this dissertation is on reusable power management techniques designed to facilitate sensor network developers in achieving their systems' required lifetimes. Broadly speaking, power management techniques fall into two categories. Many power management protocols developed within the WSN community target specific hardware subsystems in isolation, such as sensor or radio hardware. The first part of this dissertation describes the Adaptive and Robust Topology control protocol (ART), a representative hardware-specic technique for conserving energy used by packet transmissions. In addition to these single-subsystem approaches, many applications can benefit greatly from holistic power management techniques that jointly consider the sensing, computation, and communication costs of potential application configurations. The second part of this dissertation extends this holistic power management approach to two families of structural health monitoring applications. By applying a partially-decentralized architecture, the cost of collecting vibration data for analysis at a centralized base station is greatly reduced. Finally, the last part of this dissertation discusses work toward a system for clinical early warning and intervention. The feasibility of this approach is demonstrated through preliminary study of an early warning component based on historical clinical data. An ongoing clinical trial of a real-time monitoring component also provides important guidelines for future clinical deployments based on WSNs.
| | | |
| 2011-78 | | Other | Ezekiel Maier, Randall H Brown, and Michael R Brent
| 9/9/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/960/Tech_Report_6-6_ejm.pdf | Determining the beginning and end positions of each exon in each protein coding gene within a genome can be difficult because the DNA patterns that signal a gene’s presence have multiple weakly related alternate forms and the DNA fragments that comprise a gene are generally small in comparison to the size of the genome. In response to this challenge, automated gene predictors were created to generate putative gene structures. N SCAN identifies gene structures in a target DNA sequence and can use conservation patterns learned from alignments between a target and one or more informant DNA sequences. N SCAN uses a Bayesian network, generated from a phylogenetic tree, to probabilistically relate the target sequence to the aligned sequence(s). Phylogenetic substitution models are used to estimate substitution likelihood along the branches of the tree. Although N SCAN’s predictive accuracy is already a benchmark for de novo HMM based gene predictors, optimizing its use of substitution models will allow for improved conservation pattern estimates leading to even better accuracy. Selecting optimal substitution models requires avoiding overfitting as more detailed models require more free parameters; unfortunately, the number of parameters is limited by the number of known genes available for parameter estimation (training). In order to optimize substitution model selection, we tested eight models on the entire genome including General, Reversible, HKY, Jukes-Cantor, and Kimura. In addition to testing models on the entire genome, genome feature based model selection strategies were investigated by assessing the ability of each model to accurately reflex the unique conservation patterns present in each genome region. Context dependency was examined using zeroth, first, and second order models. All models were tested on the human and D. melanogaster genomes. Analysis of the data suggests that the nucleotide equilibrium frequency assumption (denoted as πi) is the strongest predictor of a model’s accuracy, followed by reversibility and transition/transversion inequality. Furthermore, second order models are shown to give an average of 0.6% improvement over first order models, which give an 18% improvement over zeroth order models. Finally, by limiting parameter usage by the number of training examples available for each feature, genome feature based model selection better estimates substitution likelihood leading to a significant improvement in N SCAN’s gene annotation accuracy.
| | | |
| 2011-64 | | Other | Kunal Agrawal and Jim Sukha | 8/9/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/959/hws.pdf | Cache-locality is an important consideration for the performance in multicore systems. In modern and future multicore systems with multilevel cache hierarchies, caches may be arranged in a tree of caches, where a level k cache is shared between Pk processors, called a processor group, and Pk increases with k. In order to get good performance, as much as possible, subcomputations that share more data should execute on processors which share a lower-level cache. Therefore, the number of cache misses in these systems depends on the scheduling decisions, and a scheduler is responsible for not just achieving good load-balance and low overheads, but also good cache complexity. However, these can be competing criteria. In this paper, we explore the tension between these criteria for online hierarchical schedulers. Formally, we consider a system with P processors, arranged in a multilevel hierarchy according to a hierarchy tree, where each of the P processors forms a leaf of the tree, and an internal node at level-k corresponds corresponds to a processor group. In addition, we assume that computations have locality regions, that represent parallel subcomputations that share data. Each locality region has a particular level, and the scheduler must ensure that a level-k locality region is executed by processors in the same level-k processor group, since they share a level k cache. Thus locality regions can improve cache performance. However, they may also impair load-balance and increase scheduling overheads since the scheduler must obey the restrictions posed by locality regions. In this paper, we present a framework of hierarchical computations, that is, computations with locality regions at multiple levels of nesting. We describe the hierarchical greedy scheduler, where each locality region is scheduled using a greedy scheduler which attempts to use as many processors as possible while obeying the restrictions posed by the locality regions. We derive a recurrence for the time complexity for a region in terms of its nested regions. We also describe how a more realistic hierarchical work-stealing scheduler can get the same bounds apart from constant factors for an important subclass of computations called homogenous computations. Finally, we also analyze the cache complexity of the hierarchical work-stealing scheduler for a system with a multilevel cache hierarchy. | kunal@cse.wustl.edu | | |
| 2011-62 | | Other | Abusayeed Saifullah, You Xu, Chenyang Lu, and Yixin Chen | 7/31/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/958/Channel_WSN.pdf | Interference between concurrent transmissions can cause severe performance degradation in wireless sensor networks (WSNs). While multiple channels available in WSN technology such as IEEE 802.15.4 can be exploited to mitigate interference, channel allocation can have a significant impact on the performance of multi-channel communication. This paper proposes a set of distributed algorithms for near-optimal channel allocation in WSNs with theoretical bounds. We first consider the problem of minimizing the number of channels needed to remove interference in a WSN, and propose both receiver-based and link-based distributed channel allocation protocols. For WSNs with an insufficient number of channels, we formulate a fair channel allocation problem whose objective is to minimize the maximum interference (MinMax) experienced by any transmission link in the network. We prove that MinMax channel allocation is NP-hard and propose a distributed link-based MinMax channel allocation protocol. We also propose a distributed protocol for link scheduling based on MinMax channel allocation. Simulations based on real topologies and data traces collected from a WSN testbed consisting of 74 TelosB motes, and using random topologies have shown that our channel allocation protocols significantly outperform a state-of-the-art channel allocation protocol. | saifullaha@cse.wustl.edu | | |
| 2011-61 | | Other | Mo Sha, Gregory Hackmann, Chenyang Lu | 4/9/2012 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/957/aedp.pdf | Low Power Listening (LPL) is a common MAC-layer technique for reducing energy consumption in wireless sensor networks, where nodes periodically wake up to sample the wireless channel to detect activity. However, LPL is highly susceptible to false wakeups caused by environmental noise being detected as activity on the channel, causing nodes to spuriously wake up in order to receive nonexistent transmissions. In empirical studies in residential environments, we observe that the false wakeup problem can significantly increase a node's duty cycle, compromising the benefit of LPL. We also find that the energy-level threshold used by the Clear Channel Assessment (CCA) mechanism to detect channel activity has a significant impact on the false wakeup rate. We then design AEDP, an adaptive energy detection protocol for LPL, which dynamically adjust a node's CCA threshold to improve network reliability and duty cycle based on application-specified bounds. Empirical experiments in both controlled tests and real-world environments showed AEDP can effectively mitigate the impact of noise on radio duty cycles, while maintaining satisfactory link reliability. | | | |
| 2011-59 | | Other | Jeremy Buhler, Kunal Agrawal, Peng Li, Roger D. Chamberlain | 7/24/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/956/balc11.pdf | In this report, we show that deadlock avoidance for streaming computations with filtering can be performed efficiently for a large class of DAG topologies. We first give efficient algorithms for dummy interval computation in series-parallel DAGs, then generalize our results to a larger graph family, the CS4DAGs, in which every undirected cycle has exactly one source and one sink. Our results show that, for a large set of application topologies that are both intuitively useful and formalizable, the streaming model with filtering can be implemented safely with reasonable compilation overhead. | pengli@wustl.edu | | |
| 2011-57 | | Other | Cindy Grimm | 6/10/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/955/SketchStudyTechReport.pdf | We present the results of an observational study on sketching. Artists were asked to sketch a small number of objects and comment on how and why they made the marks they did. We summarize these findings, from low-level details on individual marks through the drawing construction order. Based on these observations we provide suggestions for future research directions in 3D sketching.
| cmg@wustl.edu | | Drawing study description |
| 2011-54 | | MS Thesis | Christopher Thomas | 5/25/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/954/NS-3 Simulation of WiMAX Networks.pdf | Simulation is a powerful tool for analysis and improvement of networking technologies, and many simulation packages are available. One that is growing in popularity is NS-3, the successor to the popular NS-2. It is a significant departure from NS-2, and offers many advantages and disadvantages. In this thesis, we translate and update a sophisticated WiMAX simulation model from NS-2 to NS-3, and use this experience to investigate the major differences between NS-2 and NS-3, and the relative strengths of each package. We then use the NS-3 simulation model to provide analysis on a new WiMAX OFDMA downlink subframe mapping algorithm. | xtownaga@gmail.com | Advisor: Raj Jain | |
| 2011-56 | | Other | Jeremy Buhler | 5/16/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/953/gblast-reference-guide[1].pdf | This guide documents the operation of the Mercury BLASTN system for hardware-accelerated DNA similarity search. It includes detailed information on the syntax and limitations of the system's component commands, as well as a description of the system's hardware platform suitable for administrators who need to maintain a Mercury BLASTN system. Mercury BLASTN is a product of the High Performance COmputational Biology Group at Washington University. | jbuhler@wustl.edu | | Homepage of the High Performance Computational Biology Group |
| 2011-48 | | MS Thesis | Clifton M. Carey, Jr. | 5/3/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/951/Carey_thesis.pdf | Copy Number Variations (CNVs) are a significant source of human genetic diversity and are believed to be responsible for a wide variety of phenotypic variation. Recent advances in microarray-based genomic hybridization techniques have facilitated CNV analysis as a viable diagnostic technique in the clinic, and several public databases of well-characterized CNVs are being compiled, but a standard for interpreting uncharacterized CNVs has yet to emerge. This thesis examines the clinical interpretation of uncharacterized CNVs as a multiple instance binary classification problem. We analyze the current state of clinical techniques, then present and test a novel, statistical approach to the problem. | | | |
| 2011-47 | | MS Thesis | Trung Duc Nguyen | 5/3/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/950/Trung-Duc-Nguyen-MS-thesis.pdf | In bio-medical imaging, multi-partitioning surface networks (MPSNs) are very useful to model complex organs with multiple anatomical regions, such as a mouse brain. However, MPSNs are usually constructed from image data and might contain complex geometric and topological features. There has been much research on reducing the geometric complexity of a general surface (non-manifold or not) and the topological complexity of a closed, manifold surface. But there has been no attempt so far to reduce redundant topological features which are unique to non-manifold surfaces, such as curves and points where multiple sheets of surfaces join. In this thesis, we design interactive and automated means for removing redundant non-manifold topological features in MPSNs, which is a special class of non-manifold surfaces. The core of our approach is a mesh surgery operator that can effectively simplify the non-manifold topology while preserving the validity of the MPSN. The operator is implemented in an interactive user interface, allowing user-guided simplification of the input. We further develop an automatic algorithm that invokes the operator following a greedy heuristic. The algorithm is based on a novel, abstract representation of a non-manifold surface as a graph, which allows efficient discovery and scoring of possible surgery operations without the need for explicitly performing the surgeries on the mesh geometry. | | | |
| 2011-46 | | MS Project Report | Matthew Klein | 5/2/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/949/KleinMatthew_project.pdf | N/A | | | |
| 2011-45 | | Other | Abusayeed Saifullah and Kunal Agrawal | 3/21/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/948/RTSS2011_Saifullah.pdf | Multi-core processors over a significant performance increase over single-core processors. Therefore, they have the potential to enable computation-intensive real-time applications with stringent timing constraints that cannot be met on traditional single-core processors. However, most results in traditional multiprocessor real-time scheduling are limited to sequential programming models and ignore intra-task parallelism. In this paper, we address the problem of scheduling periodic parallel tasks with implicit deadlines on multi-core processors. We first consider a synchronous task model where each task consists of segments, each segment having an arbitrary number of parallel threads that synchronize at the end of the segment. We propose a new task decomposition method that decomposes each parallel task into a set of sequential tasks. We prove that our task decomposition achieves a resource augmentation bound of 4 and 5 when the decomposed tasks are scheduled using global EDF and partitioned deadline monotonic scheduling, respectively. Finally, we extend our analysis to a directed acyclic graph (DAG) task model where each node in the DAG has a unit execution requirement. We show how these tasks can be converted into synchronous tasks such that the same decomposition can be applied and the same augmentation bounds hold. Simulations based on synthetic workload demonstrate that the derived resource augmentation bounds are safe and sufficient. | saifullaha@cse.wustl.edu | | |
| 2011-20 | | MS Project Report | Alex S. Broad | 3/7/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/946/Broad_Alexander_MSProject.pdf | This project focuses on using Reinforcement Learning to optimize arm dynamics through muscle control for desired trajectories. The goal of this project was to create a tool that can be used to gain a better understanding of the arm’s muscles and collect information that is useful in many other disciplines such as biomechanics, anthropology, medicine and robotics. I developed biologically realistic models of primate arm's using Stanford’s SimTK software, an open-source tool for modeling musculoskeletal structures. I then made use of Differential Dynamic Programming in order to generate novel movements from first principles and optimize motion over a specified trajectory. Lastly, I demonstrated the usefulness of this tool by examining its effectiveness in discovering the consequences of different muscle groups on the optimal behavior policy. | | | |
| 2011-5 | | Other | Xueyang Feng, You Xu, Yixin Chen, Yinjie Tang | 2/21/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/945/DFBA-1.pdf | Shewanella bacteria are facultative anaerobes isolated from aquatic and sedimentary environments (Hau and Gralnick 2007) with a broad capacity for reduction of multiple electron receptors (Pinchuk et al. 2009; Serres and Riley 2006), including Fe(III), Mn(IV), sulfur, nitrate, and fumarate. With the accomplishment of complete genome sequencing of several Shewanella bacteria, the general pictures of the carbon metabolism have been revealed (Serres and Riley 2006). metabolism. One of the most physiological methods to decipher the time-variant metabolic regulation is to determine the dynamic distribution of intracellular metabolic fluxes since it reveals the final response of cellular metabolism to genomic, transcriptional and post-transcriptional regulations (Sauer 2006; Tang et al. 2009). In order to track the dynamic intracellular metabolic regulation, dynamic flux balance analysis (DFBA) was developed (Mahadevan et al. 2002), in which cell growth phase was divided into numerous stages, assuming that at each stage a new metabolic steady state was maintained. All the metabolic fluxes were then searched to satisfy the objective functions set for each stage. By solving this nonlinear optimization model using a cutting-edge nonlinear optimization solver (IPOPT), we confirmed the changing of carbon sources for the growth of Shewanella oneidensis MR-1 and deciphered the dynamic regulation of intracellular metabolism. | | | |
| 2011-3 | | Other | Yong Fu¤, Nicholas Kottenstette†, Chenyang Lu¤, Xenofon D. Koutsoukos | 1/25/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/942/tbmc_press.pdf | Real-time systems face significant challenges in thermal management with their adoption of modern multicore processors. While earlier research on feedback thermal control has shown promise in dealing with the uncertainties in the thermal characteristics, multicore processors introduce new challenges that cannot be handled by previous solutions designed for single-core processors. Multicore processors require the temperatures and real-time performance of multiple cores to be controlled simultaneously, leading to multi-input-multi-output (MIMO) control problems with inter-core thermal coupling. Furthermore, current Dynamic Voltage and Frequency Scaling (DVFS) mechanisms only support a finite set of states, leading to discrete control variables that cannot be handled by standard linear control techniques. This paper presents Real-Time Multicore Thermal Control (RT-MTC), the first feedback thermal control framework specifically designed for multicore real-time systems. RT-MTC dynamically enforces both the temperature and the CPU utilization bounds of a multicore processor through DVFS with discrete frequencies. RT-MTC employs a highly efficient controller that integrates saturation and proportional control components rigorously designed to enforce the desired core temperature and CPU utilization bounds. It handles discrete frequencies through a PulseWidth Modulation (PWM) that achieves effective thermal control by manipulating the dwelling time of discrete frequencies. As a result RT-MTC can achieve effective thermal control with only a small number of frequencies typical in current processors. The robustness and advantages of RTMTC over existing thermal control approaches are demonstrated through extensive simulations under a wide range of uncertainties in term of power consumption. | fuy@cse.wustl.edu | | |
| 2011-1 | | PhD Dissertation | ARPITH CHACKO JACOB | 1/18/2011 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/941/thesis-main.pdf | The rapid growth of biosequence databases over the last decade has led to a performance bottleneck in the applications analyzing them. In particular, over the last five years DNA sequencing capacity of next-generation sequencers has been doubling every six months as costs have plummeted. The data produced by these sequencers is overwhelming traditional compute systems. We believe that in the future compute performance, not sequencing, will become the bottleneck in advancing genome science. In this work, we investigate novel computing platforms to accelerate dynamic programming algorithms, which are popular in bioinformatics workloads. We study algorithm-specific hardware architectures that exploit fine-grained parallelism in dynamic programming kernels using field-programmable gate arrays (FPGAs). We advocate a high-level synthesis approach, using the recurrence equation abstraction to represent dynamic programming and polyhedral analysis to exploit parallelism. We suggest a novel technique within the polyhedral model to optimize for throughput by pipelining independent computations on an array. This design technique improves on the state of the art, which builds latency-optimal arrays. We also suggest a method to dynamically switch between a family of designs using FPGA reconfiguration to achieve a significant performance boost. We have used polyhedral methods to parallelize the Nussinov RNA folding algorithm to build a family of accelerators that can trade resources for parallelism and are between 15-130x faster than a modern dual core CPU implementation. A Zuker RNA folding accelerator we built on a single workstation with four Xilinx Virtex 4 FPGAs outperforms 198 3 GHz Intel Core 2 Duo processors. Furthermore, our design running on a single FPGA is an order of magnitude faster than competing implementations on similar-generation FPGAs and graphics processors. Our work is a step toward the goal of automated synthesis of hardware accelerators for dynamic programming algorithms. | jarpith@cse.wustl.edu | | http://www.cse.wustl.edu/~jarpith/ |
| 2010-71 | | Other | Daniel A. Lazewatsky and William D. Smart | 12/20/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/940/wu_telepresence.pdf | Most commercially-available robots are either aimed at the research community, or are designed with a single purpose in mind. The extensive hobbyist community has tended to focus on the hardware and the low-level software aspects. We claim that there is a need for a low-cost, general-purpose robot, accessible to the hobbyist community, with sufficient computation and sensing to run ``research-grade'' software. In this paper, we describe the design and implementation of such a robot. We explicitly outline our design goals, and show how a capable robot can be assembled from off-the-shelf parts, for a modest cost, by a single person with only a few tools. We also show how the robot can be used as a low-cost telepresence platform, giving the system a concrete purpose beyond being a low-cost development platform. | dlaz@cse.wustl.edu | | |
| 2010-70 | | MS Thesis | Ananth Mohan | 12/10/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/939/AnanthMohan_thesis.pdf | Learning how to rank a set of objects relative to an user defined query has received much interest in the machine learning community during the past decade. In fact, there have been two recent competitions hosted by internationally prominent search companies to encourage research on ranking web site documents. Recent literature on learning to rank has focused on three approaches: point-wise, pair-wise, and list-wise. Many different kinds of classifiers, including boosted decision trees, neural networks, and SVMs have proven successful in the field. This thesis surveys traditional point-wise techniques that use regression trees for web-search ranking. The thesis contains empirical studies on Random Forests and Gradient Boosted Decision Trees, with novel augmentations to them on real world data sets. We also analyze how these point-wise techniques perform on new areas of research for web-search ranking: transfer learning and feature-cost aware models.
| | | |
| 2010-62 | | Other | Gregory Hackmann, Minmin Chen, Octav Chipara, Chenyang Lu, Yixin Chen, Marin Kollef, Thomas C. Bailey | 11/9/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/938/amia11.pdf | Clinical study has found early detection and intervention to be essential for preventing clinical deterioration in patients at general hospital units. In this paper, we envision a two-tiered early warning system designed to identify the signs of clinical deterioration and provide early warning of serious clinical events. The first tier of the system automatically identifies patients at risk of clinical deterioration from existing electronic medical record databases. The second tier performs real-time clinical event detection based on real-time vital sign data collected from on-body wireless sensors attached to those high-risk patients. We employ machine-learning techniques to analyze data from both tiers, assigning scores to patients in real time. The assigned scores can then be used to trigger early-intervention alerts. Preliminary study of an early warning system component and a wireless clinical monitoring system component demonstrate the feasibility of this two-tiered approach. | | | |
| 2010-61 | | Other | Abusayeed Saifullah, You Xu, Chenyang Lu, and Yixin Chen | 11/3/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/937/Priority_Assignment.pdf | Recent years have witnessed the adoption of wireless sensor-actuator networks as a communication infrastructure for process control applications. An important enabling technology for industrial process control is WirelessHART, an open wireless sensor-actuator network standard specifically developed for process industries. A key challenge faced byWirelessHART networks is to meet the stringent real-time communication requirements imposed by feedback control systems in process industries. Fixed priority scheduling, a popular scheduling policy in real-time networks, has recently been shown to be an effective real-time transmission scheduling policy in WirelessHART networks. Priority assignment has a major impact on the schedulability of real-time flows in these networks. This paper investigates the open problem of priority assignment for periodic real-time flows for feedback control loops closed through a WirelessHART network. We first propose an optimal priority assignment algorithm based on branch and bound for any given worst case delay analysis. We then propose an efficient heuristic search algorithm for priority assignment. We also identify special cases where the heuristic search is optimal. Simulations based on random networks and the real topology of a physical sensor network testbed showed that the heuristic search algorithm achieved near optimal performance in terms of schedulability, while significantly outperforming traditional real-time priority assignment policies. | saifullaha@cse.wustl.edu | | |
| 2010-60 | | PhD Dissertation | Manfred Georg | 10/28/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/936/diss.pdf | This dissertation addresses the problem of parameterizing object motion within a set of images taken with a stationary camera. We develop data-driven methods across all image scales: characterizing motion observed at the scale of individual pixels, along extended structures such as roads, and whole image deformations such as lungs deforming over time. The primary contributions include (a) fundamental studies of the relationship between spatio-temporal image derivatives accumulated at a pixel, and the object motions at that pixel, (b) data driven approaches to parameterize breath motion and reconstruct lung CT data volumes, and (c) defining and offering initial results for a new class of Partially Unsupervised Manifold Learning (PUML) problems, which often arise in medical imagery. Specifically, we create energy functions for measuring how consistent a given velocity vector is with observed spatio-temporal image derivatives. These energy functions are used to fit parametric snake models to roads using velocity constraints. We create an automatic data-driven technique for finding the breath phase of lung CT scans which is able to replace external belt measurements currently in use clinically. This approach is extended to automatically create a full deformation model of a CT lung volume during breathing or heart MRI during breathing and heartbeat. Additionally, motivated by real use cases, we address a scenario in which a dataset is collected along with meta-data which describes some, but not all, aspects of the dataset. We create an embedding which displays the remaining variability in a dataset after accounting for variability related to the meta-data.
| manfred.georg@gmail.com | Dissertation of Manfred Georg Committee: Robert Pless, Chair Guy Genin Cindy Grimm Tao Ju William Richard John Shareshian
| |
| 2010-58 | | Other | Shobana Padmanabhan, Yixin Chen, and Roger D. Chamberlain | 10/19/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/935/spadmanabhan_ip11.pdf | Many embedded and scientific applications are frequently pipelined asynchronously and deployed on architecturally diverse systems to meet performance requirements and resource constraints. We call such pipelined applications streaming applications. Typically, there are several design parameters in the algorithms and architectures used that, when customized, impact the tradeoff between different metrics of application performance as well as resource utilization. Automatic exploration of this design space is the goal of this research. When using architecturally diverse systems to accelerate streaming applications, the design search space is often complex. We present a global optimization framework comprising a novel domain-specific variation of branch-and-bound that reduces search complexity by exploiting the topology of the application's pipelining. We exploit the topological information to discover decomposability through the canonical Jordan block form and to originate two novel decomposition techniques that we name linear chaining and convex decomposition. The reduction in search complexity that we achieve for two prototypical real-world streaming applications is significant. For one application the size of the search space reduces from about 10^18 to 10^12, a million-fold reduction, and for the second application, the reduction is a factor of at least 20~million. | spadmanabhan@wustl.edu | | |
| 2010-57 | | Other | Huang-Ming Huang, Christopher Gill, Chengyang Lu | 10/14/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/934/mcflow1_a.pdf | Modern computer architectures have evolved from uni-processor platforms to multi-processor and multi-core plat- forms, but traditional real-time distributed middleware such as RT-CORBA has not kept pace with that evolution. To address those issues, this paper describes the design and implementation of MCFlow, a new real-time distributed middleware for dependent task graphs running on multi-core platforms. MCFlow provides the following contributions to the state of the art in real-time middleware: (1) it provides an efficient C++ based component model through which computations can be configured flexibly for execution within a single core, across cores of a common host, or spanning multiple hosts; (2) it allows optimizations for inter-component communication to avoid data copying without sacrificing the parallel executability of data dependent tasks; (3) it strictly separates timing and functional concerns of an application so that they can evolve and can be configured independently; and (4) it provides a novel event dispatching architecture that uses lock free algorithms to avoid mutex locking and reduce memory contention, CPU context switching, and priority inversion. We also present an empirical evaluation that demonstrates the efficacy of our approach. | {hh1, cdgill, lu}@cse.wustl.edu | | |
| 2010-38 | | Other | Sisu Xi, Justin Wilson, Chenyang Lu, Christopher Gill | 10/11/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/931/ecrts11_rtxen.pdf | The ability to build systems of systems is vital to the real-time community as we endeavor to build increasingly sophisticated systems. Subsystems must be modular and isolated to ensure desired real-time performance when assembled by the system integrator. While virtualization has gained broad adoption and success for system integration in non-real-time domains, no existing virtual machine manager is designed to enforce real-time performance. On the other hand, while a rich body of promising theories on hierarchical real-time scheduling algorithms has been proposed in the literature to support composable real-time performance guarantees, they have not been incorporated and evaluated in real virtualized systems. We have developed RT-Xen, the first virtual machine manager designed to support real-time domains through hierarchical real-time scheduling. RT-Xen bridges the gap between virtualization and hierarchical scheduling by introducing a hierarchical real-time scheduling framework in Xen, the most widely used open-source virtual machine manager. Experimental evaluation revealed important insights on the feasibility and efficacy real-time virtualization: (1) RT-Xen can effective provide virtualization and scheduling to Linux guest OSes at 1~ms scheduling resolution (quantum) while incurring moderate overhead; (2) RT-Xen can provide real-time performance isolation against overrun domains; (3) RT-Xen significantly outperformed the standard credit-based Xen scheduler in term of real-time performance. RT-Xen represents a promising step toward real-time virtualization for system integration of real-time applications. | xisisu@gmail.com | | |
| 2010-36 | | Other | Abusayeed Saifullah, You Xu, Chenyang Lu, and Yixin Chen | 10/11/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/930/WH_SCHEDULABILITY.pdf | The WirelessHART standard has been specifically designed for real-time communication between sensor and actuator devices for industrial process monitoring and control. End-to-end communication delay analysis for WirelessHART networks is required for acceptance test of real-time data flows from sensors to actuators and for workload adjustment in response to network dynamics. In this paper, we map the scheduling of real-time periodic data flows in a WirelessHART network to real-time multiprocessor scheduling. We, then, exploit the response time analysis for multiprocessor scheduling and propose a novel method for the end-to-end delay analysis of the real-time flows that are scheduled using a fixed priority scheduling policy in a WirelessHART network. Simulations based on both random topologies and real network topologies of a physical testbed demonstrate the efficacy of our end-to-end delay analysis in terms of acceptance ratio under various fixed priority scheduling policies. | saifullaha@cse.wustl.edu | | http://www.cse.wustl.edu/~saifullaha |
| 2010-35 | | Other | Mo Sha, Gregory Hackmann, Chenyang Lu | 10/9/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/929/ARCH.pdf | Home area networks (HANs) promise to enable sophisticated home automation applications such as smart energy usage and assisted living. However, recent empirical study of HAN reliability in real-world residential environments revealed significant challenges to achieving reliable performance in the face of significant and variable interference from a multitude of coexisting wireless devices. We propose the Adaptive and Robust Channel Hopping (ARCH) protocol: a lightweight receiveroriented protocol which handles the dynamics of residential environments by reactively channel hopping when channel conditions have degraded. ARCH has several key features. First, ARCH is an adaptive protocol that channel-hops based on changes in channel quality observed in real time. Second, ARCH is a distributed protocol that selects channels on a per-link basis, due to the large link-to-link variations in channel quality observed under empirical study. Third, ARCH is designed to be robust and lightweight. ARCH uses a practical hand-shaking approach to handle channel desynchronization and an efficient slidingwindow scheme that does not involve expensive calculations or modeling, and can be reasonably implemented on memoryconstrained wireless sensor platforms. Fourth, ARCH introduces minimal communication overhead for applications where packet acknowledgements are already enabled. We evaluate our approach through real deployment in real-life apartments with residents’ daily activity. Our results demonstrate that ARCH can reduce the number of packet retransmissions by a median of 42.3% compared to using a single, fixed wireless channel, and can enable up to a 2.2 improvement in delivery rate on the most unreliable links in our experiment. Under a multi-hop routing scenario, ARCH achieved an average 31.6% reduction in radio usage, by reducing the ETX along each path by up to 83.6%. Due to ARCH’s lightweight reactive design, most links achieve this improvement in reliability with 10 or fewer channel hops per day. | | | |
| 2010-34 | | PhD Dissertation | Charles Wiseman | 9/23/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/928/dissertation.pdf | Virtualized network infrastructures are currently deployed in both research and commercial contexts. The complexity of the virtualization layer varies greatly in different deployments, ranging from cloud computing environments, to carrier Ethernet applications using stacked VLANs, to networking testbeds. In all of these cases, many users are sharing the resources of one provider and each user expects their resources to be isolated from all other users. There are many challenges associated with the control and management of these systems, including resource allocation and sharing, resource isolation, system security, and usability. Among the different types of virtualized infrastructures, network testbeds are of particular interest due to their widespread use in education and in the networking research community. Networking researchers rely extensively on testbeds when evaluating new protocols and ideas. Indeed, a substantial percentage of top research papers include results gathered from testbeds. Network emulation testbeds in particular are often used to conduct innovative research because they allow users to emulate diverse network topologies in a controlled environment. That is, researchers run experiments with a collection of resources that can be reconfigured to represent many different network scenarios. The user typically has control over most of the resources in their experiment which results in a high level of reproducibility. As such, these types of testbeds provide an excellent bridge between simulation and deployment of new ideas. Unfortunately, most testbeds suffer from a general lack of resource extensibility and diversity. This dissertation extends the current state of the art by designing a new, more general testbed infrastructure that expands and enhances the capabilities of modern testbeds. This includes pertinent abstractions, software design, and related algorithms. The design has also been prototyped in the form of the Open Network Laboratory network testbed, which has been successfully used in educational and research pursuits. While the focus is on network testbeds, the results of this research will also be applicable to the broader class of virtualized system infrastructures. | wiseman@wustl.edu | | |
| 2010-32 | | Other | Mo Sha, Gregory Hackmann, Chenyang Lu | 7/30/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/926/infocom11.pdf | Home area networks (HANs) consisting of wireless sensors have emerged as the enabling technology for important applications such as smart energy and assisted living. A key challenge faced by HANs is maintaining reliable operation in real-world residential environments. This paper presents two in-depth empirical studies on the wireless channels in real homes. The spectrum study analyzes the spectrum usage in the 2.4 GHz band where wireless sensor networks based on the IEEE 802.15.4 standard must coexist with existing wireless devices. We characterize the ambient wireless environment in six apartments through passive spectrum analysis across the entire 2.4 GHz band over seven days in each of the apartments. Notably, we find that the wireless conditions in these residential environments can be much more complex and varied than in a typical office environment. Moreover, while 802.11 signals play a significant role in spectrum usage, there also exist non-negligible noise from non-802.11 devices. The multi-channel link study measures the reliability of different 802.15.4 channels through active probing with motes. We discover that there is not always a persistently reliable channel over 24 hours; that reliability is strongly correlated across adjacent channels; and that link reliability does not exhibit cyclic behavior at daily or weekly timescales. Nevertheless, reliability can be maintained through a small number of channel hops per day, suggesting channel diversity as a key tool for designing robust HANs in residential environments. Our empirical studies provide important guidelines and insights for robust wireless sensor network design in residential environments. | | | |
| 2010-31 | | Other | Subharthi Paul, Raj Jain, Jianli Pan, Chakchai So-in | 6/23/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/925/Jain_techreport.pdf | The next generation Internet needs to support multiple diverse application contexts. In this paper, we present Internet 3.0, a diversified, multi-tier architecture for the next generation Internet. Unlike the current Internet, Internet 3.0 defines a new set of primitives that allows diverse applications to compose and optimize their specific contexts over resources belonging to multiple ownerships. The key design philosophy is to enable diversity through explicit representation, negotiation and enforcement of policies at the granularity of network infrastructure, compute resources, data and users. The basis of the Internet 3.0 architecture is a generalized three-tier object model. The bottom tier consists of a high-speed network infrastructure. The second tier consists of compute resources or hosts. The third tier consists of data and users. The “tiered” organization of the entities in the object model depicts the natural dependency relationship between these entities in a communication context. All communication contexts, including the current Internet, may be represented as special cases within this generalized three-tier object model. The key contribution of this paper is a formal architectural representation of the Internet 3.0 architecture over the key primitive of the “Object Abstraction” and a detailed discussion of the various design aspects of the architecture, including the design of the “Context Router-” the key architectural element that powers an evolutionary deployment plan for the clean slate design ideas of Internet 3.0. | {spaul, jain,jp10,cs5}@wustl.edu | | http://www.cse.wustl.edu/~jain/ |
| 2010-30 | | Other | Rahav Dor | 6/23/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/924/RahavDor_thesis.pdf | Hardware and software design requires the right portion of skills and mental faculties. The design of a good system is an exercise in rational thinking, engineering, and art. The design process is further complicated when we aspire to build systems that exploit parallelism or are targeted to be deployed on architecturally diverse computing devices, FPGAs or GPUs to name just a few. The need to develop systems that can take advantage of computing devices beyond general purpose CPUs is real. There are several application domains and research efforts that will simply not be able to adequately perform or yield answers in a reasonable amount of time otherwise. Developing a mathematical model of a system is a key stepping stone to a high performance system, but often is absent from the design process due to the complexity of the model development. In this paper we offer an easy, yet solid approach to the development of such mathematical models. This adds a little bit more weight to engineering side of the design process in the form of a quantifiable method that enables designers to reason about their systems, identify bottlenecks, and gain vital information for performance improvements. | rahav.dor@wustl.edu | | |
| 2010-29 | | Other | Qiang Lu and You Xu and Ruoyun Huang and Yixin Chen | 6/12/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/923/parallel-mrw.pdf | Graph search has been employed by many AI techniques and applications. A natural way to improve the efficiency of search is to utilize ad- vanced, more powerful computing platforms. However, expensive computing infrastructures, such as supercomputers and large-scale clusters, are traditionally available to only a limited number of projects and researchers. As a results, most AI applications, with access to only commodity com- puters and clusters, cannot benefit from the efficiency improvements of high-performance parallel search algorithms. Cloud computing provides an attractive, highly accessible alternative to other traditional high- performance computing platforms. In this paper, we first show that the run-time of our stochastic search algorithm in planning is a heavy-tailed distribution, which has a remarkable variability. Second, we propose an algorithm framework that takes advantage of cloud computing.
| | | |
| 2010-28 | | Other | Tom Erez and William D. Smart | 5/27/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/922/uai10_1.pdf | Partially-Observable Markov Decision Processes (POMDPs) are typically solved by finding an approximate global solution to a corresponding belief-MDP. In this paper, we offer a new planning algorithm for POMDPs with continuous state, action and observation spaces. Since such domains have an inherent notion of locality, we can find an approximate solution using local optimization methods. We parameterize the belief distribution as a Gaussian mixture, and use the Extended Kalman Filter (EKF) to approximate the belief update. Since the EKF is a first-order filter, we can marginalize over the observations analytically. By using feedback control and state estimation during policy execution, we recover a behavior that is effectively conditioned on incoming observations despite the unconditioned planning. Local optimization provides no guarantees of global optimality, but it allows us to tackle domains that are at least an order of magnitude larger than the current state-of-the-art. We demonstrate the scalability of our algorithm by considering a simulated hand-eye coordination domain with 16 continuous state dimensions and 6 continuous action dimensions. | etom@cse.wustl.edu | | |
| 2010-27 | | Other | Terry Tidwell, Robert Glaubius, Christopher D. Gill and William D. Smart | 5/25/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/921/rtssSubmission1.pdf | Classical scheduling abstractions such as deadlines and priorities do not readily capture the complex timing semantics found in many real-time cyber-physical systems. Time utility functions provide a necessarily richer description of timing semantics, but designing utility-aware scheduling policies using them is an open research problem. In particular, optimal utility accrual scheduling design is needed for real-time cyber-physical domains. In this paper we design optimal utility accrual scheduling policies for cyber-physical systems with periodic, non-preemptable tasks that run with stochastic duration. These policies are derived by solving a Markov Decision Process formulation of the scheduling problem. We use this formulation to demonstrate that our technique improves on existing heuristic utility accrual scheduling policies. | | | |
| 2010-26 | | Other | Abusayeed Saifullah, Chenyang Lu, You Xu, and Yixin Chen | 5/14/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/920/RTSS_2010_WH.pdf | WirelessHART is an open wireless sensor-actuator network standard for industrial process monitoring and control that requires real-time data communication between sensor and actuator devices. Salient features of a WirelessHART network include a centralized network management architecture, multi-channel TDMA transmission, redundant routes, and avoidance of spatial reuse of channels for enhanced reliability and real-time performance. This paper makes several key contributions to real-time transmission scheduling in WirelessHART networks: (1) formulation of the end-to-end real-time transmission scheduling problem based on the characteristics of WirelessHART; (2) proof of NP-hardness of the problem; (3) an optimal branch-and-bound scheduling algorithm based on a necessary condition for schedulability; and (4) an efficient and practical heuristic-based scheduling algorithm called Conflict-aware Least Laxity First (C-LLF). Extensive simulations based on both random topologies and real network topologies of a physical testbed demonstrate that C-LLF is highly effective in meeting end-to-end deadlines in WirelessHART networks, and significantly outperforms common real-time scheduling policies. | saifullaha@cse.wustl.edu | | http://www.cse.wustl.edu/~saifullaha |
| 2010-25 | | MS Thesis | Simon Tam | 5/12/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/919/Tam_thesis.pdf | Stroke is the leading cause of disability in the industrialized world. The disabilities caused by stroke can significantly impact the quality of daily life for stroke survivors. Studies show that a high number of daily body movement repetitions can help recover lost motor functionality. However, many stroke patients do not perform the home exercises needed to reach this high number of repetitions. Games for rehabilitation have been shown to provide motivation to perform home exercises. However, because stroke can leave survivors with different levels of disability, custom games are required. Creating custom games, though, is time consuming and requires advanced programming knowledge. A tool for quick and easy game development is a solution to this problem. In this master’s project, I collaborated with a doctoral occupational therapy student to take steps towards developing a game authoring environment custom to therapists. | | | |
| 2010-24 | | PhD Dissertation | Octav Chipara | 5/11/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/918/chipara_final.pdf | Wireless sensor networks are poised to change the way computer systems interact with the physical world. We plan on entrusting sensor systems to collect medical data from patients, monitor the safety of our infrastructure, and control manufacturing processes in our factories. To date, the focus of the sensor network community has been on developing best-effort services. This approach is insufficient for many applications since it does not enable developers to determine if a system's requirements in terms of communication latency, bandwidth utilization, reliability, or energy consumption are met. The focus of this thesis is to develop real-time network support for such critical applications. The main theoretical contribution of this thesis is the development of novel transmission scheduling techniques optimized for data collection applications. This work bridges the gap between wireless sensor networks and real-time scheduling theory, which have traditionally been applied to processor scheduling. The thesis also focuses on the development of a real-time patient monitoring system for general hospital units. We show that it is feasible to use wireless sensor networks for clinical monitoring through a study in a step-down cardiac care unit at Barnes-Jewish Hospital.
| | | |
| 2010-23 | | Other | Jwalant Ahir; Jeremy Buhler; Roger D. Chamberlain | 5/11/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/917/ahir_et_al1.pdf | Biosequence similarity search is an important application in computational biology. Mercury BLASTN, an FPGA-based implementation of BLAST for DNA, is one of the alternatives for fast DNA sequence comparison. The re-design of BLAST into a streaming application combined with a high-throughput hardware pipeline have enabled Mercury BLAST to emerge as one of the fastest implementations of bio-sequence similarity search. This performance can be further enhanced by exploiting the data-level parallelism present within the application. Here we present a multiple FPGA-based Mercury BLASTN design in order to double the speed and throughput of DNA sequence computation. This paper describes a dual Mercury BLASTN design, the detailed design of the split and merge functions, and simulation results. | roger@wustl.edu | | |
| 2010-21 | | Other | Roger D. Chamberlain; Greg A. Galloway; Mark A. Franklin | 5/7/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/916/Chamberlain_etal.pdf | Expressing concurrency in applications has always been a difficult and error-prone endeavor, yet effective utilization of multi-core processors requires that the concurrency in applications be understood. One approach to the expression of concurrency is streaming, which has shown real promise as a safe and effective method for many application classes. Here, we express a classic problem, sorting, in the streaming paradigm and explore the implications of various algorithm and architectural design parameters on the performance of the application.
| | | |
| 2010-17 | | PhD Dissertation | Nathan Jacobs | 5/4/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/915/dissertation_jacobs.pdf | The web has an enormous collection of live cameras that capture images of roads, beaches, cities, mountains, buildings, and forests. With an appropriate foundation, this massively distributed, scalable, and already existing network of cameras could be used as a new global sensor. The sheer number of cameras and their broad spatial distribution prompts new questions in computer vision: How can you automatically geo-locate each camera? How can you learn the 3D structure of the scene? How can you correct the color measurements from the camera? What environmental properties can you extract? We address these questions using models of the geometry and statistics of natural outdoor scenes, and with an understanding of how image appearance varies due to transient objects, the weather, the time of day, and the season. We show algorithms to calibrate cameras and annotate scenes that formalize this understanding in classical linear and nonlinear optimization frameworks. Our work uses images captured at long temporal scales, ranging from minutes to years; often these images are from a database of images we created by capturing images from 1000 webcams every half hour. By exploring natural cues capable of working over such long time scales on a broad range of scenes, we deepen our understanding of the interplay between geographic location, time, and the causes of image appearance change.
| jacobsn@gmail.com | | http://www.cse.wustl.edu/~jacobsn/ |
| 2010-16 | | PhD Dissertation | Sasakthi S. Abeysinghe | 4/30/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/913/AbeysingheSasakthi.pdf | Electron Cryo-Microscopy or cryo-EM is an area that has received much attention in the recent past. Compared to the traditional methods of X-Ray Crystallography and NMR Spectroscopy, cryo-EM can be used to image much larger complexes, in many different conformations, and under a wide range of biochemical conditions. This is because it does not require the complex to be crystallisable. However, cryo-EM reconstructions are limited to intermediate resolutions, with the state-of-the-art being 3.6A, where secondary structure elements can be visually identified but not individual amino acid residues. This lack of atomic level resolution creates new computational challenges for protein structure identification. In this dissertation, we present a suite of geometric algorithms to address several aspects of protein modeling using cryo-EM density maps. Specifically, we develop novel methods to capture the shape of density volumes as geometric skeletons. We then use these skeletons to find secondary structure elements (SSEs) of a given protein, to identify the correspondence between these SSEs and those predicted from the primary sequence, and to register high-resolution protein structures onto the density volume. In addition, we designed and developed Gorgon, an interactive molecular modeling system, that integrates the above methods with other interactive routines to generate reliable and accurate protein backbone models. | sasakthi@gmail.com | | http://cse.wustl.edu/~ssa1 |
| 2010-15 | | Other | William D. Smart, Annamaria Pileggi, and Leila Takayama | 4/7/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/912/tr.pdf | This is a collection of papers presented at the workshop "What Do Collaborations with the Arts Have to Say About HRI", held at the 2010 Human-Robot Interaction Conference, in Osaka, Japan. | wds@cse.wustl.edu | | |
| 2010-12 | | Other | Shobana Padmanabhan, Roger D. Chamberlain, Yixin Chen | 3/30/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/911/shobanapadmanabhan2010.pdf | High-performance streaming applications are typically pipelined and deployed on architecturally diverse (hybrid)systems. Developers of such applications are interested in customizing components used, so as to benefit application performance. We present an efficient and automatic technique for design-space exploration of applications in this problem domain. We solve performance tuning as an optimization problem by formulating cost functions using results from queueing theory. This results in a mixed-integer nonlinear optimization problem which is NP-hard. We reduce the search complexity by decomposing the search space. We have developed a domain-specific decomposition technique using topological information of the application embodied in the queueing network models. Our analysis includes when our decomposition preserves optimality. Our preliminary empirical results confirm two-fold benefits--solving a problem that is currently not solvable using state-of-the-art solvers and in some problem instances, improving initial solution value from the solver by over two orders of magnitude.
| shobana@arl.wustl.edu | | |
| 2010-11 | | Other | Manfred Georg, Tobias Preusser, and Horst K. Hahn | 3/11/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/910/idealvessel_1.pdf | We present a framework for the construction of vascular systems based on optimality principles of theoretical physiology. Given the position and flow distribution of end points of a vascular system, we construct the topology and positions of internal nodes to complete the vascular system in a realistic manner. Optimization is driven by intravascular volume minimization with constraints derived from physiological principles. Direct optimization of a vascular system, including topological changes, is used instead of simulating vessel growth. A good initial topology is found by extracting key information from a previously optimized model with less detail. This technique is used iteratively in a multi-level approach to create a globally optimized vascular system.
| mgeorg@cse.wustl.edu | Most of this work was completed at Fraunhofer MeVis during the summer of 2004. | |
| 2010-10 | | PhD Dissertation | Yuanfang Zhang | 8/1/2008 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/909/yfzhang_thesis.pdf | While traditional real-time middleware such as Real-Time CORBA have shown promise as distributed software platforms for systems with time constraints, existing middleware systems lack (1) schedulability analysis and run-time enforcement mechanisms needed to give online real-time guarantees for aperiodic tasks, (2) flexible configuration mechanisms needed to manage end-to-end timing easily for a wide range of dfferent distributed real-time systems with both periodic and aperiodic tasks, and (3) support for scheduling soft real-time tasks on multicore platforms while guaranteeing their time constraints will be satisfied. To address the limitations of current generation real-time middleware, this dissertation makes three major contributions to the state of the art in real-time middleware: we have developed (1) the first instantiation of an integrated scheduling framework supporting both periodic and aperiodic tasks (on TAO, a widely used Real-Time CORBA middleware); (2) the first configurable component middleware services for admission control and load balancing of periodic and aperiodic tasks in distributed real-time systems (on CIAO, a Component-Integrated Real-Time middleware); and (3) MC-ORB, the first real-time object request broker (ORB) designed to exploit the features of multicore platforms, with admission control and task allocation services that can provide schedulability guarantees for soft real-time tasks on multicore platforms.
| | | |
| 2010-8 | | MS Thesis | Emily M. Yang | 1/8/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/908/EmilyYang_Thesis.pdf | Persons who have suffered from stroke participate in occupational therapy to help recover occupational functionality, but therapy is expensive and maximal recovery often depends on repetitive, tedious exercises to be done by patients both in therapy sessions and on their own. Often patients do not have the resources or motivation to complete the treatment required to give them the best results. This thesis is presented as part of a larger project in which we aim to enable occupational therapists to use the Looking Glass programming environment to create computer games for their patients that can be played inexpensively and effectively, both inside and outside of therapy sessions. Looking Glass will allow for occupational therapists with little or no programming background to write customized games for their patients. Through using Wii remotes and webcams to track movement and translate it to a computer game, this solution has the potential to provide a more engaging and interesting way for patients to correct ly do repetitive movements without needing constant therapist supervision or expensive and complicated equipment. It also can provide highly customizable and adjustable game settings to accommodate for the wide range of impairments that can result from stroke. This thesis presents a study of the needs of occupational therapists and stroke patients who compose the user base of the project and implications for the design, the development of a webcam color tracking system to be used for movement tracking in games, and an application to be used by therapists to assign specific, patient-tailored calibrations and game levels as part of treatments and to track and organize improvement statistics. These are all key components required for the successful development of the overall project. | emy1@wustl.edu | | |
| 2010-7 | | PhD Dissertation | Robert Glaubius | 1/7/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/907/glaubius_dissertation.pdf | Scheduling policies for open soft real-time systems must be able to balance the competing concerns of meeting their objectives under exceptional conditions while achieving good performance in the average case. Balancing these concerns requires modeling strategies that represent the range of possible task behaviors, and solution techniques that are capable of effectively managing uncertainty in order to discover scheduling policies that are effective across the range of system modes. We develop methods for solving a particular class of task scheduling problems in an open soft real-time setting involving repeating, non-preemptable tasks that contend for a single shared resource. We enforce timeliness by optimizing performance with respect to the proportional progress of tasks in the system. We model this scheduling problem as an infinite-state Markov decision process, and provide guarantees regarding the existence of optimal solutions to this problem. We derive several methods for approximating optimal scheduling policies and provide theoretical justification and empirical evidence that these solutions are good approximations to the optimal solution. We consider cases in which task models are known, and adapt reinforcement learning methods to learn task models when they are not available. | rlg1@wustl.edu | | |
| 2010-6 | | MS Thesis | You Xu | 1/6/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/906/youxu_thesis.pdf | Partial Order Reduction (POR) is a technique that reduces the search space by recognizing interchangeable orders between actions and expanding only a subset of all possible orders during the search. It has been extensively studied in model checking and has proven to be an enabling technique for reducing the search space and costs. Several POR algorithms have been proposed in planning, including the Expansion Core (EC) and Stratified Planning (SP) algorithms. Being orthogonal to the development of accurate heuristic functions, these reduction methods show great potential to improve the planning efficiency from a new perspective. However, it is unclear how these POR methods relate to each other and whether there exist stronger reduction methods. In this thesis, we have proposed a unifying theory that provides a necessary and sufficient condition for two actions to be semi-commutative. We have also revealed that semi-commutativity is the central property that enables POR. We have also interpreted both EC and SP algorithms using this new theory. Further, we have proposed new, stronger POR algorithms based on the new theory. We have also applied these new algorithms to solve benchmark problems across various planning domains. Experimental results have shown significant search cost reduction. | youxu@wustl.edu | | |
| 2010-4 | | PhD Dissertation | Chien-Liang Fok | 1/6/2010 | http://cse.wustl.edu/Research/Lists/Technical Reports/Attachments/905/fok_dissertation.pdf | Mobile ad hoc networks (MANETs) and wireless sensor networks (WSNs) are two recently-developed technologies that uniquely function without xed infrastructure support, and sense at scales, resolutions, and durations previously not possible. While both offer great potential in many applications, developing software for these types of networks is extremely dffcult, preventing their wide-spread use. Three primary challenges are (1) the high level of dynamics within the network in terms of changing wireless links and node hardware congurations, (2) the wide variety of hardware present in these networks, and (3) the extremely limited computational and energy resources available. Until now, the burden of handling these issues was put on the software application developer. This dissertation presents three novel programming models and middleware systems that address these challenges: Limone, Agilla, and Servilla. Limone reliably handles high levels of dynamics within MANETs. It does this through lightweight coordination primitives that make minimal assumptions about network connectivity. Agilla enables self-adaptive WSN applications via the integration of mobile agent and tuple space programming models, which is critical given the continuously changing network. It is the rst system to successfully demonstrate the feasibility of using mobile agents and tuple spaces within WSNs. Servilla addresses the challenges that arise from WSN hardware heterogeneity using principles of Service-Oriented Computing (SOC). It is the rst system to successfully implement the entire SOC model within WSNs and uniquely tailors it to the WSN domain by making it energy-aware and adaptive. The ecacies of the above three systems are demonstrated through implementation, micro-benchmarks, and the evaluation of several real-world applications including Universal Remote, Fire Detection and Tracking, Structural Health Monitoring, and Medical Patient Monitoring. | | | |