Our group welcomes collaboration from both the database community and other fields in Computer Science.
Abstract: Answers to database queries often form the basis for
critical decision-making. To improve efficiency and reliability, answers to
these queries can be provided by distributed servers close to the querying
clients. However, because of the servers' ubiquity, the logistics associated
with fully securing them may be prohibitive; moreover, when the servers are run
by third parties, the clients may not trust them as much as they trust the
original data owners. Thus, the authenticity of the answers provided by servers
in response to clients' queries must be verifiable by the clients. More
generally, database responses are more useful if they contain the evidence of
their own correctness. For example, this enables a consumer to provide her own
credit report to a creditor without having the creditor request it from the
reporting agency to establish the validity of the report. This project is
developing methods for verifying the validity and authenticity of the answers
to a variety of database queries, including general relational, data cube, and
spatio-temporal queries. Furthermore, techniques that use powerful
cryptographic primitives are being developed for providing authentication and
confidentiality. This research will enable utilization of this infrastructure
in applications where users must rely on the authenticity of the answer, such
as in financial systems, network monitoring, traffic control, or applications
yet to be imagined. The results of this project will be disseminated through
publications in journals and conferences. Furthermore, source code of these
methods, in the form of libraries, will be made available over the web.
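As a rough illustration of how an answer can carry the evidence of its own correctness, the sketch below uses a Merkle hash tree, one common building block for authenticated query answers. It is not the project's actual method. The function names, the record format, and the assumption that the client already holds a trusted root digest from the data owner (in practice the owner would sign it) are all hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest used for both leaves and internal nodes."""
    return hashlib.sha256(data).digest()


def build_levels(records):
    """Data owner: return all levels of a Merkle tree, leaves first."""
    level = [h(b"leaf:" + r) for r in records]
    levels = [level]
    while len(level) > 1:
        # On an odd-sized level the last node is paired with itself.
        padded = level + [level[-1]] if len(level) % 2 else level
        level = [h(b"node:" + padded[i] + padded[i + 1])
                 for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Untrusted server: collect the sibling digests on the path to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # node was paired with itself
        proof.append((level[sibling], index % 2 == 0))
        index //= 2
    return proof


def verify(record, proof, root):
    """Client: recompute the root from the record and the proof."""
    digest = h(b"leaf:" + record)
    for sibling, node_is_left in proof:
        pair = digest + sibling if node_is_left else sibling + digest
        digest = h(b"node:" + pair)
    return digest == root


# Hypothetical records; the owner publishes only the root digest.
records = [b"alice,720", b"bob,655", b"carol,701"]
levels = build_levels(records)
root = levels[-1][0]

proof = prove(levels, 2)
print(verify(b"carol,701", proof, root))  # genuine record
print(verify(b"carol,801", proof, root))  # tampered record
```

The client needs only the root digest and a proof whose length is logarithmic in the number of records, so the server does not have to be trusted to return an unmodified record.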
Funding: Funded by NSF Cyber Trust Program under the project "CT-ISG:
Collaborative Research: Towards Trustworthy Database Systems", PI,
Feifei Li, 10/01/08-09/30/11, $150,620.
Abstract: When dealing with massive quantities of data, ranking and
aggregate queries are powerful techniques for focusing attention on the most
important answers. Many applications that produce such massive quantities of
data inherently introduce uncertainty at the same time, for example,
probabilistic matches in data integration, imprecise measurements from sensors,
fuzzy duplicates in data cleaning, and inconsistency in scientific data. Hence, the
importance of these queries is even greater in probabilistic data, where a
relation can encode exponentially many possible worlds. Uncertainty opens the
gate to many possible definitions for ranking and aggregate queries. With the
wide presence of probabilistic data, processing ranking and aggregate queries
efficiently with the right semantics is of key importance for the successful
deployment of probabilistic databases.
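To make the notion of possible worlds concrete, the sketch below enumerates every world of a small relation and computes, for each tuple, the probability that it appears among the top-k by score. This is only one of the many possible ranking definitions, and it is computed here by brute force. The tuple-independence assumption, the function names, and the sample data are illustrative and not taken from the project.

```python
from itertools import product


def possible_worlds(tuples):
    """Yield (world, probability) pairs for a tuple-independent relation.

    Each input tuple is (name, score, p), where p is the probability
    that the tuple exists. A relation with n tuples has 2**n worlds.
    """
    for present in product([False, True], repeat=len(tuples)):
        prob = 1.0
        world = []
        for (name, score, p), keep in zip(tuples, present):
            prob *= p if keep else 1.0 - p
            if keep:
                world.append((name, score))
        yield world, prob


def topk_probabilities(tuples, k):
    """Probability that each tuple ranks among the k highest scores."""
    result = {name: 0.0 for name, _, _ in tuples}
    for world, prob in possible_worlds(tuples):
        ranked = sorted(world, key=lambda t: t[1], reverse=True)
        for name, _ in ranked[:k]:
            result[name] += prob
    return result


# Hypothetical relation: (name, score, existence probability).
relation = [("t1", 100, 0.4), ("t2", 90, 0.9), ("t3", 80, 0.7)]
print(topk_probabilities(relation, k=1))
```

With this sample data, t1 has the highest score but is in the top-1 answer with probability of only about 0.4, while t2 is there with probability about 0.54, so ranking by score and ranking by top-1 probability disagree. The enumeration visits 2^n worlds, which is why efficient algorithms are needed for relations of realistic size.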
Funding: Funded by NSF IIS Program under the project "III: Small: Efficient
Ranking and Aggregate Query Processing for Probabilistic Data", sole PI,
Feifei Li, 09/01/09-08/31/12, $328,831.
Abstract: Spatial databases have enabled a large number of
applications in the last decade that shape the horizon of computing services from
people's daily life to scientific research. For example, people rely on
online map services to plan their trips; the deployment and query processing in
large sensor networks often require the design of location-aware algorithms.
Driven by this increasing number of applications, efficient processing of
important and novel query types in spatial databases has always been a focal
point. We are interested in designing geometry-aware query processing
algorithms for spatial and multimedia databases. More specifically, by taking
into account the particular geometric properties pertaining to the application
at hand, our goal is to deliver practical, efficient, and effective algorithms
for real applications on large-scale data sets.
Funding: Funded by NSF under the project "CAREER: Core-Sets for Geometric
Optimization and their Applications", PI,
Piyush Kumar, 01/16/07-01/15/12, $400,000.