<?xml version="1.0" encoding="utf-8"?><rss version="2.0"><channel><title>MSU Computer Science and Engineering Technical Reports</title><link>http://www.cse.msu.edu/publications</link><description>Publications from Michigan State University's Department of Computer Science and Engineering</description><generator>kbg / http://www.cse.msu.edu/~acj/rss/msutechreports</generator><item><title>Bit Weaving: A Non-prefix Approach to Compressing Packet Classifiers in TCAMs</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=922</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=671"&gt;Chad R. Meiners&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=666"&gt;Alex X. Liu&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=124"&gt;Eric Torng&lt;/a&gt;&lt;P&gt; Ternary Content Addressable Memories (TCAMs) have become the de facto standard in industry for fast packet classification. Unfortunately, TCAMs have limitations of small capacity, high power consumption, high heat generation, and high cost. The well-known range expansion problem exacerbates the problem of smaller TCAMs by significantly decreasing the already limited capacity of these TCAMs as each classifier rule typically has to be converted to multiple TCAM rules. One method for coping with smaller TCAMs is to use compression schemes to reduce the number of TCAM rules required to represent a classifier. Although several TCAM-based classifier compression schemes have been proposed, they are all limited to producing prefix classifiers, which means that they all miss the compression opportunities created by non-prefix ternary classifiers. In this paper, we propose bit weaving, the first non-prefix classifier compression scheme. Bit weaving is based on the observation that adjacent TCAM entries that have a Hamming distance of one (i.e., differ by only one bit) can be merged into one entry by replacing the bit in question with *. 
Bit weaving consists of two new techniques, bit swapping and bit merging, to first identify and then merge such rules together. The key advantages of bit weaving are that it runs fast and it is composable with other TCAM optimization methods as a pre/post-processing routine. We implemented bit weaving and conducted experiments on both real-world and synthetic packet classifiers. Our experimental results show the following: (i) bit weaving is an effective stand-alone compression technique (it achieves an average compression ratio of 23.6%) and (ii) bit weaving finds compression opportunities that other methods miss. Specifically, bit weaving improves the prior TCAM optimization techniques of TCAM Razor and Topological Transformation by an average of 12.8% and 36.5%, respectively.  </description><pubDate>2009-11-23</pubDate></item><item><title>Quarnet: A Tool for Quantifying Static Network Reachability</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=923</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=688"&gt;Amir R. Khakpour&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=689"&gt;Alex Liu&lt;/a&gt;&lt;P&gt; Quantifying static network reachability is useful in many aspects of network management: troubleshooting, maintenance, design, security auditing, etc. In this paper, for the first time, we propose a suite of algorithms for quantifying reachability based on the static configuration (mainly ACLs) of a network. We also present a network reachability model that considers connectionless and connection-oriented transport protocols, stateless and stateful routers/firewalls, static and dynamic NAT, PAT, etc. We implemented the algorithms in our network reachability monitoring tool called Quarnet and conducted experiments on a university network. Although computing static network reachability is expensive in nature, the experimental results show that Quarnet is efficient enough to be used in practice.  
&lt;p&gt;&lt;a href="http://www.cse.msu.edu/publications/tech/TR/MSU-CSE-09-2.pdf"&gt;PDF&lt;/a&gt;</description><pubDate>2009-11-23</pubDate></item><item><title>Energy Efficient Routing in Mobile Wireless Sensor Networks</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=925</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=690"&gt;Fatme El&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=124"&gt;Eric Torng&lt;/a&gt;&lt;P&gt; Energy optimization is an important problem in wireless sensor networks because typical battery powered nodes have a finite lifetime. We focus on networks with nodes that can move. We move nodes to reduce the energy consumed by transmissions taking into consideration the energy consumed by moving nodes. We consider two traffic patterns: a single source single destination pattern and a multiple sources single destination pattern. For both traffic patterns, we first propose a centralized scheme where one node acts as a controller, computes (ideally) optimal positions for other nodes, and transmits the new positions to the relevant nodes. The nodes then move to their computed positions and begin the transmission. Then we propose distributed protocols to extend our solution to applications with real-time requirements. Our experimental results show that our centralized algorithm can reduce energy consumption by roughly 20% when compared to the best existing approaches. Our distributed protocols also perform well (within 2% of our optimal centralized algorithm) for both traffic patterns.  
</description><pubDate>2009-11-23</pubDate></item><item><title>System-level Calibration for Data Fusion in Wireless Sensor Networks</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=926</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=691"&gt;Rui Tan, Zhaohui Yuan, Guoliang Xing, Xue Liu, Jianguo Yao&lt;/a&gt;&lt;P&gt; Systematic biases in sensor measurements undermine the performance of wireless sensor networks in mission-critical applications such as target detection and tracking. Traditional device-level calibration approaches become intractable for moderate to large-scale networks due to limited access to individual sensors after deployment. In this paper, we propose a two-tier system-level calibration approach for a class of sensor networks that employ data fusion to improve the overall system performance. In the first tier, each sensor learns its local sensing model from noisy measurements using an online algorithm and only transmits a few model parameters. In the second tier, sensors' local sensing models are then calibrated to a common system sensing model. Our approach fairly distributes computation overhead among sensors and significantly reduces the communication overhead of system-level calibration. Based on this approach, we develop an optimal model calibration scheme that maximizes the target detection probability of a sensor network under bounded false alarm rate. The simulations based on synthetic data as well as real data traces collected by 18 sensors show that our system-level calibration scheme can improve the detection performance of a sensor network by up to 50%.  
&lt;p&gt;&lt;a href="http://www.cse.msu.edu/publications/tech/TR/MSU-CSE-09-5.pdf"&gt;PDF&lt;/a&gt;</description><pubDate>2009-11-23</pubDate></item><item><title>Towards Model Modeling of Biological Systems</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=927</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=692"&gt;Manuel I. Capel Tunon&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=280"&gt;Laura K. Dillon&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=693"&gt;Theresa M. Casey&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=405"&gt;Betty H. C. Cheng&lt;/a&gt;&lt;P&gt; This technical report documents a preliminary case study in which hybrid automata were used to model dynamics of fat metabolism in rat dams during pregnancy. The model describes effects of lipoprotein lipase (LPL) and hormone sensitive lipase (HSL) enzyme activities in the mammary gland and adipose tissue on blood triglyceride concentration during pregnancy in rat dams. In future work, we will refine the model and tune it to fit with data collected from animal experiments. We will then assess the model's predictive ability. We will also investigate how the model can be used to validate the experimental data.  </description><pubDate>2009-11-23</pubDate></item><item><title>Iustitia: An Information Theoretical Approach to High-speed Flow Nature Identification</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=928</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=688"&gt;Amir R. Khakpour&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=689"&gt;Alex Liu&lt;/a&gt;&lt;P&gt; This paper concerns the fundamental problem of identifying the content nature of a flow, namely text, binary, or encrypted, for the first time. We propose Iustitia, a tool for identifying flow nature on the fly. 
The key observation behind Iustitia is that text flows have the lowest entropy and the encrypted flows have the highest entropy, where the entropy of binary flows stands in between. The basic idea of Iustitia is to classify flows using machine learning techniques where a feature is the entropy of every certain number of consecutive bytes. The key features of Iustitia are high speed (10% of average packet inter-arrival time) and high accuracy (86%).  </description><pubDate>2009-11-23</pubDate></item><item><title>Developing and Applying Design Patterns for Dynamically Adaptive Systems</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=929</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=694"&gt;Andres J. Ramirez&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=460"&gt;Betty H.C. Cheng&lt;/a&gt;</description><pubDate>2009-11-23</pubDate></item><item><title>QVS: Quality-aware Voice Streaming for Wireless Sensor Networks</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=930</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=695"&gt;Liqun Li&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=679"&gt;Guoliang Xing&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=696"&gt;Limin Sun&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=697"&gt;Yan Liu&lt;/a&gt;&lt;P&gt; Recent years have witnessed the pilot deployments of audio or low-rate video wireless sensor networks for a class of mission-critical applications including search and rescue, security surveillance and disaster management. In this paper, we report the design and implementation of Quality-aware Voice Streaming (QVS) for wireless sensor networks. QVS is built upon SenEar, a new sensor hardware platform we developed for high-bandwidth wireless audio communication. 
QVS comprises several novel components, which include an empirical model for online voice quality evaluation and control, dynamic voice compression/duplication adaptation for lossy wireless links, and distributed stream admission control that exploits network capacity for rate allocation. We have extensively tested QVS on a 20-node network deployment. Our experimental results show that QVS delivers satisfactory voice quality under a range of realistic settings while achieving high network capacity utilization.  &lt;p&gt;&lt;a href="http://www.cse.msu.edu/publications/tech/TR/MSU-CSE-09-9.pdf"&gt;PDF&lt;/a&gt;</description><pubDate>2009-11-23</pubDate></item><item><title>Latent Fingerprint Matching</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=931</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=2"&gt;Anil K. Jain&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=668"&gt;Jianjiang Feng&lt;/a&gt;&lt;P&gt; Latent fingerprint identification is of critical importance to law enforcement agencies in identifying suspects. While tremendous progress has been made in plain and rolled fingerprint matching, latent fingerprint matching continues to be a difficult problem. Latent fingerprints are inadvertent impressions left by fingers on surfaces of objects. Poor quality of ridge impressions, small finger area and large non-linear distortion are the main difficulties in latent fingerprint matching, compared to plain or rolled fingerprint matching. We propose a system for matching latent fingerprints to rolled fingerprints that is needed in forensics applications. In addition to minutiae, we also use extended features, including singularity, ridge quality map, ridge flow map, ridge wavelength map and skeleton. Our system was tested by matching 258 latents in NIST SD27 database against a background database of 29,257 rolled fingerprints obtained by combining NIST SD4, SD14 and SD27 databases. 
The minutiae-based baseline rank-1 identification rate of 34.9% was improved to 74% when extended features are used. In order to evaluate the relative importance of each extended feature, these features are incrementally used in the order of their cost in marking by latent experts. The experimental results indicate that singularity, ridge quality map and ridge flow map are the most effective features in improving the matching accuracy.  </description><pubDate>2009-11-23</pubDate></item><item><title>Data Clustering: 50 Years Beyond K-Means</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=932</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=2"&gt;Anil K. Jain&lt;/a&gt;&lt;P&gt; Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organisms into taxonomic ranks (domain, kingdom, phylum, class, etc.). Cluster analysis is the formal study of algorithms and methods for grouping, or clustering, objects according to measured or perceived intrinsic characteristics or similarity. Cluster analysis does not use category labels that tag objects with prior identifiers, i.e., class labels. The absence of category information distinguishes data clustering (unsupervised learning) from classification or discriminant analysis (supervised learning). The aim of clustering is exploratory in nature to find structure in data. Clustering has a long and rich history in a variety of scientific fields. One of the most popular and simple clustering algorithms, K-means, was first published in 1955. In spite of the fact that K-means was proposed over 50 years ago and thousands of clustering algorithms have been published since then, K-means is still widely used. This speaks to the difficulty of designing a general purpose clustering algorithm and the ill-posed problem of clustering. 
We provide a brief overview of clustering, summarize well known clustering methods, discuss the major challenges and key issues in designing clustering algorithms, and point out some of the emerging and useful research directions, including semi-supervised clustering, ensemble clustering, simultaneous feature selection during data clustering, and large scale data clustering.  </description><pubDate>2009-11-23</pubDate></item><item><title>Evolving Quorum Sensing in Digital Organisms</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=933</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=488"&gt;Benjamin Beckmann&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=254"&gt;Philip K. McKinley&lt;/a&gt;&lt;P&gt; For centuries it was thought that bacteria live asocial lives. However, recent discoveries show many species of bacteria communicate in order to perform tasks previously thought to be limited to multicellular organisms. Central to this capability is quorum sensing, whereby organisms detect cell density and use this information to trigger group behaviors. Quorum sensing is used by bacteria in the formation of biofilms, secretion of digestive enzymes and, in the case of pathogenic bacteria, release of toxins or other virulence factors. Indeed, methods to disrupt quorum sensing are currently being investigated as possible treatments for numerous diseases, including cystic fibrosis, epidemic cholera, and methicillin-resistant Staphylococcus aureus (MRSA). In this paper we demonstrate the evolution of a quorum sensing behavior in populations of digital organisms. Specifically, we show that digital organisms are capable of evolving a strategy to collectively suppress self-replication, when the population density reaches a specific, evolved threshold. We present the evolved genome of an organism exhibiting this behavior and analyze the collective operation of this algorithm. 
Finally, through a set of experiments we demonstrate that the behavior scales to populations up to 400 times larger than those in which the behavior evolved.  </description><pubDate>2009-11-23</pubDate></item><item><title>Evolution of Probabilistic Consensus in Digital Organisms</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=934</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=484"&gt;David B. Knoester&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=254"&gt;Philip K. McKinley&lt;/a&gt;&lt;P&gt; The complexity of distributed computing systems and their increasing interaction with the physical world impose challenging requirements in terms of adaptation, robustness, and resilience to attack. Based on their reliance on heuristics, algorithms for consensus, where members of a group agree on a course of action, are particularly sensitive to these conditions. Given the ability of natural organisms to respond to adversity, many researchers have investigated biologically-inspired approaches to designing robust distributed systems. Examples include biomimetics, which mimic behaviors such as swarming found in nature, as well as evolutionary computation methods, such as genetic algorithms and neuroevolution, which simulate the natural processes that produce those behaviors. A related but fundamentally different technique is digital evolution, a type of artificial life system whereby a population of self-replicating computer programs exists in a user-defined computational environment and is subject to instruction-level mutations and natural selection. Over thousands of generations, these digital organisms can evolve to survive, and thrive, under extremely dynamic and adverse conditions. In this paper, we describe a study in the use of digital evolution to produce a distributed behavior for reaching consensus. 
The evolved algorithm employs a novel mechanism for probabilistically reaching consensus based on the frequency of messaging. Moreover, this design approach enables us to change parameters based on the specifics of the desired system, with evolution producing corresponding flavors of consensus algorithms. Our results demonstrate that artificial life systems can be used to discover solutions to engineering problems, and that experiments in artificial life can inspire new studies in distributed protocol development.  </description><pubDate>2009-11-23</pubDate></item><item><title>An Efficient Key Point Quantization Algorithm for Large Scale Image Retrieval</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=937</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=698"&gt;Fengjie Li&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=699"&gt;Wei Tong&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=490"&gt;Anil Jain&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=409"&gt;Rong Jin&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=652"&gt;Jung-Eun Lee&lt;/a&gt;&lt;P&gt; Local image features, such as SIFT descriptors, have been shown to be more effective for content-based image retrieval (CBIR) than global image features. They represent each image by a bag of low-level image features that are often referred to as key points. The similarity between two images is decided by the similarity between the associated sets of key points. The most popular approach for key point based image similarity measure is the clustering-based bag-of-words model. It maps each key point to a visual word in a codebook that is constructed by a clustering algorithm, and represents each image by a histogram of visual words. 
Despite its success, there are three main shortcomings of the clustering-based bag-of-words model: (i) it is computationally expensive to cluster millions of key points into thousands of visual words; (ii) there is no theoretical analysis on the performance of the bag-of-words model; (iii) it usually does not allow for partial matching between key points. We propose a new scheme for key point quantization that addresses these shortcomings. Instead of clustering, the proposed scheme quantizes each key point into a vector by a collection of randomly generated hyper-spheres, and a bag-of-words model is constructed for each image by summing the quantization vectors of all its key points. Our theoretical analysis shows that the resulting image similarity provides an upper bound for the similarity based on the optimal partial matching between two sets of key points. Empirical study on a database of 100,000 tattoo images with ten million key points shows that (i) the proposed scheme is significantly more efficient than the clustering-based approach for key point quantization, and (ii) it achieves better retrieval accuracy than the clustering-based approach.  </description><pubDate>2009-11-23</pubDate></item><item><title>Facial Marks: Soft Biometric For Face Recognition</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=938</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=2"&gt;Anil K. Jain&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=491"&gt;Unsang Park&lt;/a&gt;&lt;P&gt; Face Recognition systems have incorporated both global and local features. We propose to utilize micro features, namely facial marks (e.g., speckles, moles, and scars) to further improve face recognition and retrieval performance. 
Facial marks can be used in three ways: i) to supplement the matching results of an existing face matcher, ii) to enable a fast retrieval based on facial marks, and iii) to enable matching or retrieval from a partial or profile face image with marks. In order to automatically detect these marks, we first use Active Appearance Model (AAM) to locate and segment primary facial features (e.g., eyes, nose, and mouth). Laplacian-of-Gaussian (LoG) and morphological operators are used on the face region without primary facial features to detect facial marks. Experimental results based on 1,225 images (mugshots) of 671 subjects show that the use of facial marks improves the rank-1 identification accuracy of a state-of-the-art face recognition system from 91.88% to 93.14%.  </description><pubDate>2009-11-23</pubDate></item><item><title>TCAM SPliT: Optimizing Space, Power, and Throughput for TCAM-based Packet Classification Systems</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=939</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=671"&gt;Chad R. Meiners&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=700"&gt;Alex X Liu&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=124"&gt;Eric Torng&lt;/a&gt;&lt;P&gt; Packet classification is the core mechanism that enables many networking services such as packet filtering and traffic accounting. Using Ternary Content Addressable Memories (TCAMs) to perform high-speed packet classification has become the de facto standard in industry because TCAMs can facilitate constant time classification by comparing a packet with all rules of ternary encoding in parallel. Despite their high speed, TCAMs have limitations of small capacity, large power consumption, and relatively slow access times. The well-known range expansion problem in converting range rules to ternary rules significantly exacerbates these TCAM limitations. 
While we can expect some gain in TCAM performance from improved hardware, the demands on TCAM performance as measured by the number of rules in packet classifiers increase far more rapidly due to the explosive growth of Internet services and threats. One reason TCAM-based packet classifiers are so large is the multiplicative effect inherent in representing d-dimensional classifiers in TCAMs. Although several TCAM optimization techniques have been proposed, they cannot fully address the multiplicative effect because they still produce five-dimensional classifiers. To address the multiplicative effect, we abandon the traditional TCAM classification paradigm that performs a single five-dimensional lookup on a single five-dimensional classifier stored in a large TCAM. Instead, we propose the TCAM SPliT architecture, where a five dimensional classifier is split into two low dimensional classifiers, each of which is stored on its own small TCAM. Accordingly, a five-dimensional lookup is split into two low dimensional, pipelined lookups with one lookup on each chip. TCAM SPliT is the first scheme that fundamentally addresses the multiplicative effect in TCAM-based packet classification by decomposing a high dimensional classifier into two low dimensional classifiers. TCAM SPliT significantly improves TCAM-based packet classification in the three aspects of space, power, and throughput. On real-world classifiers, for space, TCAM SPliT reduces space requirement by an average of 92%; for power, TCAM SPliT reduces power consumption by an average of 37.9%; for throughput, TCAM SPliT improves classification throughput by an average of 63.3%. 
Assuming the average compression ratio of 8% applies to large classifiers, we extrapolate that TCAM SPliT can reduce power consumption by more than a factor of 5 and increase throughput by more than a factor of 2.5 when applied to a classifier whose direct expansion requires a 36Mb TCAM, which is the largest TCAM chip available.  </description><pubDate>2009-11-23</pubDate></item><item><title>Impact of Data Fusion on Real-Time Detection in Sensor Networks</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=940</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=678"&gt;Rui Tan&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=679"&gt;Guoliang Xing&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=683"&gt;Benyuan Liu&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=682"&gt;Jianping Wang&lt;/a&gt;&lt;P&gt; Real-time detection is an important requirement of many mission-critical wireless sensor network applications such as battlefield monitoring and security surveillance. Due to the high network deployment cost, it is crucial to understand and predict the real-time detection capability of a sensor network. However, most existing real-time analyses are based on overly simplistic sensing models (e.g., the disc model) that do not capture the stochastic nature of detection. In practice, data fusion has been adopted in a number of sensor systems to deal with sensing uncertainty and enable the collaboration among sensors. However, real-time performance analysis of sensor networks designed based on data fusion has received little attention. In this paper, we bridge this gap by investigating the fundamental real-time detection performance of large-scale sensor networks under stochastic sensing models. Our results show that data fusion is effective in achieving stringent performance requirements such as short detection delay and low false alarm rates, especially in the scenarios with low signal-to-noise ratios (SNRs). 
Compared with the disc model, data fusion can reduce the network density by about 60% while the false alarm rate is lower than 1% and any intruder can be detected within one detection period. The disc model is only suitable when the SNR is sufficiently high. Our results help understand the impact of data fusion and provide important guidelines for the design of real-time wireless sensor networks for intrusion detection.  &lt;p&gt;&lt;a href="http://www.cse.msu.edu/publications/tech/TR/MSU-CSE-09-19.pdf"&gt;PDF&lt;/a&gt;</description><pubDate>2009-11-23</pubDate></item><item><title>Symbolic Synthesis of Masking Fault-Tolerant Distributed Programs</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=941</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=465"&gt;Borzoo Bonakdarpour&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=300"&gt;Sandeep S. Kulkarni&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=472"&gt;Fuad Abujarad&lt;/a&gt;</description><pubDate>2009-11-23</pubDate></item><item><title>Constraint Based Automated Synthesis of Nonmasking and Stabilizing Fault-Tolerance</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=942</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=472"&gt;Fuad Abujarad&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=300"&gt;Sandeep S. Kulkarni&lt;/a&gt;&lt;P&gt; We focus on constraint-based automated addition of nonmasking and stabilizing fault-tolerance to hierarchical programs. We specify legitimate states of the program in terms of constraints that should be satisfied in those states. To deal with faults that may violate these constraints, we add recovery actions while ensuring interference freedom among the recovery actions added for satisfying different constraints. 
Since the constraint-based approach is well known to be applicable in the manual design of nonmasking fault-tolerance, we expect our approach to have a significant benefit in automating the design of fault-tolerant programs. We illustrate our algorithms with three case studies: stabilizing mutual exclusion, stabilizing diffusing computation, and a data dissemination problem in sensor networks. With experimental results, we show that the complexity of synthesis is reasonable and that it can be reduced using the structure of the hierarchical systems. To our knowledge, this is the first instance where automated synthesis has been successfully used in synthesizing programs that are correct under fairness assumptions. Moreover, in two of the case studies considered in this paper, the structure of the recovery paths is too complex to permit existing heuristic based approaches for adding recovery.  </description><pubDate>2009-11-23</pubDate></item><item><title>A Goal-Based Modeling Approach to Develop Requirements for Adaptive Systems with Environmental Uncertainty</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=943</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=460"&gt;Betty H.C. Cheng&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=701"&gt;Pete Sawyer&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=702"&gt;Nelly Bencomo&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=703"&gt;Jon Whittle&lt;/a&gt;&lt;P&gt; Dynamically adaptive systems (DASs) are intended to monitor the execution environment and then dynamically adapt their behavior in response to changing environmental conditions. The uncertainty of the execution environment is a major motivation for dynamic adaptation; it is impossible to know at development time all of the possible combinations of environmental conditions that will be encountered. 
To date, the work performed in requirements engineering for a DAS includes requirements monitoring and reasoning about the correctness of adaptations, where the DAS requirements are assumed to exist. This paper introduces a goal-based modeling approach to develop the requirements for a DAS, while explicitly factoring uncertainty into the process and resulting requirements. We introduce a variation of threat modeling to identify sources of uncertainty and demonstrate how the RELAX specification language can be used to specify more flexible requirements within a goal model to handle the uncertainty.  &lt;p&gt;&lt;a href="http://www.cse.msu.edu/publications/tech/TR/MSU-CSE-09-22.pdf"&gt;PDF&lt;/a&gt;</description><pubDate>2009-11-23</pubDate></item><item><title>A Semi-supervised Framework for Simultaneous Regression and Classification of Time Series with Application to Precipitation Prediction</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=944</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=704"&gt;Zubin Abraham&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=419"&gt;Pang-Ning Tan&lt;/a&gt;&lt;P&gt; Many time series forecasting problems involve skewed time series, where many of the real-valued observations are zeros. Due to the skewed distribution, current regression models tend to underestimate the future prediction values. To overcome this problem, we present a novel semi-supervised learning framework that simultaneously combines a classification model (to predict whether the observation value is exactly zero) and a regression model (to predict the actual value of the non-zero observation). We demonstrate the effectiveness of the framework in terms of its application to precipitation prediction for climate modeling.  
</description><pubDate>2009-11-23</pubDate></item><item><title>Multi-channel Interference Measurement and Modeling in Low-Power Wireless Networks</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=946</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=706"&gt;Guoliang Xing; Mo Sha; Jun Huang; Gang Zhou; Xiaorui Wang; Shuch Liu&lt;/a&gt;&lt;P&gt; Multi-channel design has received significant attention for low-power wireless networks (LWNs), such as 802.15.4-based wireless sensor networks, due to its potential of mitigating interference and improving network capacity. However, recent studies reveal that the number of orthogonal channels available on commodity wireless platforms is small, which significantly hinders the performance of existing multichannel protocols. A promising solution is to explore the use of partially overlapping channels for communications. However, this approach faces several key challenges such as increased inter-channel interference and significantly higher overhead of channel measurement. In this paper, we systematically study the inter-channel interference and its impact on link capacity and the performance of multi-channel protocols in LWNs. First, we develop empirical models for characterizing inter-channel signal attenuation based on experiments on TelosB motes. We then propose a novel algorithm which can significantly reduce the overhead of multi-channel interference measurement by exploiting the spectral power density (SPD) of the transmitter. Finally, we apply our interference models to both link capacity analysis and channel assignment protocols. Our extensive experiments on a testbed of 30 TelosB motes show that our interference measurement algorithm has an average error of 2.95%. Our results also demonstrate that multi-channel protocols for LWNs can significantly benefit from using overlapping channels.  
&lt;p&gt;&lt;a href="http://www.cse.msu.edu/publications/tech/TR/MSU-CSE-09-25.pdf"&gt;PDF&lt;/a&gt;</description><pubDate>2009-11-23</pubDate></item><item><title>Privacy and Integrity Preserving Range Queries in Sensor Networks</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=947</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=707"&gt;Fei Chen&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=666"&gt;Alex X. Liu&lt;/a&gt;&lt;P&gt; The architecture of two-tiered sensor networks, where storage nodes serve as an intermediate tier between sensors and a sink for storing data and processing queries, has been widely adopted because of the benefits of power and storage saving for sensors as well as the efficiency of query processing. However, the importance of storage nodes also makes them attractive to attackers. In this paper, we propose SafeQ, a protocol that prevents attackers from gaining information from both sensor-collected data and sink-issued queries. SafeQ also allows a sink to detect compromised storage nodes when they misbehave. To preserve privacy, SafeQ uses a novel technique to encode both data and queries such that a storage node can correctly process encoded queries over encoded data without knowing their actual values. To preserve integrity, we propose a new data structure called neighborhood chaining that allows a sink to verify whether the result of a query contains exactly the data items that satisfy the query. To improve performance, we optimize SafeQ using Bloom filters. In addition, we propose a solution to adapt SafeQ for event-driven sensor networks. Compared with prior art, SafeQ excels in two aspects. First, SafeQ provides significantly better security and privacy. While prior art allows a compromised storage node to obtain a reasonable estimate of the values of sensor-collected data and sink-issued queries, SafeQ makes such estimation impossible. 
Second, SafeQ delivers significantly better performance on both power consumption and storage space. We performed extensive side-by-side comparison with prior art in our experiments. Results show that the power and space savings of SafeQ over prior art grow exponentially with the number of dimensions. Regarding power consumption, for two-dimensional data, our experimental results show that SafeQ consumes one order of magnitude less power for sensors and storage nodes, and for three-dimensional data, SafeQ consumes two orders of magnitude less power for sensors and storage nodes. Regarding space on storage nodes, for two-dimensional data, our experimental results show that SafeQ uses one order of magnitude less space, and for three-dimensional data, SafeQ uses two orders of magnitude less space.  </description><pubDate>2009-11-23</pubDate></item><item><title>Constraint Based Automated Synthesis of Nonmasking and Stabilizing Fault-Tolerance</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=948</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=472"&gt;Fuad Abujarad&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=300"&gt;Sandeep S. Kulkarni&lt;/a&gt;&lt;P&gt; We focus on constraint-based automated addition of nonmasking and stabilizing fault-tolerance to hierarchical programs. We specify legitimate states of the program in terms of constraints that should be satisfied in those states. To deal with faults that may violate these constraints, we add recovery actions while ensuring interference freedom among the recovery actions added for satisfying different constraints. Since constraint-based design is well known to be applicable in the manual design of nonmasking fault-tolerance, we expect our approach to significantly benefit the automation of fault-tolerant programs. 
We illustrate our algorithms with three case studies: stabilizing mutual exclusion, stabilizing diffusing computation, and a data dissemination problem in sensor networks. With experimental results, we show that the complexity of synthesis is reasonable and that it can be reduced using the structure of the hierarchical systems. To our knowledge, this is the first instance where automated synthesis has been successfully used in synthesizing programs that are correct under fairness assumptions. Moreover, in two of the case studies considered in this paper, the structure of the recovery paths is too complex to permit existing heuristic-based approaches for adding recovery.  &lt;p&gt;&lt;a href="http://www.cse.msu.edu/publications/tech/TR/MSU-CSE-09-27.pdf"&gt;PDF&lt;/a&gt;</description><pubDate>2009-11-23</pubDate></item><item><title>Efficient Large Scale Image Retrieval: A Statistical Modeling Approach</title><link>http://www.cse.msu.edu/cgi-user/web/tech/document?ID=949</link><description>Authors: &lt;a href="search?ANDOR=&amp;amp;AUTHORID=699"&gt;Wei Tong&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=698"&gt;Fengjie Li&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=409"&gt;Rong Jin&lt;/a&gt;, &lt;a href="search?ANDOR=&amp;amp;AUTHORID=490"&gt;Anil Jain&lt;/a&gt;&lt;P&gt; Local image features, such as SIFT descriptors, have been shown to be effective for content-based image retrieval (CBIR). A large number of studies are devoted to efficiently identifying visually similar images from a database for given queries using local image features. In order to achieve efficient image retrieval, most of these approaches require the quantization of local image features. Since vector quantization often results in the loss of information in local image features, these approaches tend to yield suboptimal performance compared to the approaches that directly use the image features without quantization. 
In this work, we present a statistical modeling framework for large-scale image retrieval that aims to achieve efficient image retrieval without having to quantize the image features. The proposed framework estimates a kernel density function for each image in the database, and computes the similarity of images to a query image by query log-likelihood. Efficient algorithms are presented for kernel density estimation and effective image matching. Experiments with large-scale image retrieval and automated image annotation confirm that the proposed method is both efficient and effective for identifying visually similar images for given queries from large image databases.  </description><pubDate>2009-11-23</pubDate></item></channel></rss>
