Full-Text Articles in Computer Engineering
A Probabilistic Software Framework For Scalable Data Storage And Integrity Check, Sisi Xiong
Data has overwhelmed the digital world in terms of volume, variety, and velocity, and data-intensive applications face unprecedented challenges. Computation resources such as memory, meanwhile, are scarce relative to data scale, yet certain applications must still process large amounts of data in a time-efficient manner. Probabilistic approaches offer a compromise among these three perspectives: large amounts of data, limited computation resources, and high time efficiency. Such approaches cannot guarantee 100% correctness, but their error rates are predictable and adjustable depending on the available computation resources and time constraints ...
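The excerpt stops before naming a concrete technique, but the canonical probabilistic structure with exactly this profile, fixed memory, no false negatives, and an error rate that can be traded against space, is the Bloom filter. The Python sketch below is illustrative only; the class, sizing formulas, and sample values are not taken from the dissertation.

```python
import hashlib
import math

class BloomFilter:
    """Probabilistic set membership: no false negatives,
    and a false-positive rate tuned against memory."""

    def __init__(self, n_items, fp_rate):
        # Optimal sizing: m = -n*ln(p)/(ln 2)^2 bits, k = (m/n)*ln 2 hashes.
        self.m = math.ceil(-n_items * math.log(fp_rate) / math.log(2) ** 2)
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        # Derive k bit indices from two halves of one strong hash.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

# A tighter fp_rate costs more bits; a looser one saves them. This is the
# adjustable error/resource tradeoff the abstract describes.
bf = BloomFilter(n_items=1_000_000, fp_rate=0.01)  # ~1.2 MB for 1M items
bf.add("block-42-checksum")
assert "block-42-checksum" in bf
```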
Achieving High Reliability And Efficiency In Maintaining Large-Scale Storage Systems Through Optimal Resource Provisioning And Data Placement, Lipeng Wan
With the explosive increase in the amount of data generated by various applications, large-scale distributed and parallel storage systems have become common data storage solutions and are widely deployed in both industry and academia. While these high-performance storage systems significantly accelerate data storage and retrieval, they also raise critical issues in system maintenance and management. In this dissertation, I propose three methodologies to address three of these critical issues.
First, I develop an optimal resource management and spare provisioning model to minimize the impact of component failures and ensure a highly operational experience ...
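The abstract does not spell the model out, so what follows is only a hedged sketch of the kind of calculation a spare-provisioning model rests on: if component failures are independent with exponential lifetimes, failures within a restocking window are Poisson-distributed, and the spare pool can be sized to a target probability of sufficiency. All figures and names here are illustrative, not the dissertation's.

```python
import math

def spares_needed(n_drives, mttf_hours, window_hours, target_prob):
    """Smallest spare count s such that P(failures in window <= s)
    >= target_prob, under a Poisson failure model."""
    lam = n_drives * window_hours / mttf_hours  # expected failures in window
    term = cdf = math.exp(-lam)                 # Poisson pmf/cdf at s = 0
    s = 0
    while cdf < target_prob:
        s += 1
        term *= lam / s                         # pmf recurrence
        cdf += term
    return s

# Illustrative numbers only: 10,000 drives, 1.2M-hour MTTF, spares
# restocked monthly (720 h), 99.9% chance the pool suffices.
print(spares_needed(10_000, 1_200_000, 720, 0.999))
```

Stocking fewer spares saves cost but raises the chance of running short mid-window; a provisioning model makes that tradeoff explicit rather than leaving it to rules of thumb.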
Universal Schema For Knowledge Representation From Text And Structured Data, Limin Yao
In data integration we transform information from a source into a target schema. A general problem in this task is loss of fidelity and coverage: the source expresses more knowledge than can fit into the target schema, or knowledge that is hard to fit into any schema at all. This problem is taken to an extreme in information extraction (IE), where the source is natural language, one of the most expressive forms of knowledge representation. To address this issue, one can either automatically learn a latent schema emergent in text (a brittle and ill-defined task) or manually define schemas ...
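Universal schema, the approach this line of work develops, takes the union of structured relations and textual surface patterns as the schema itself, and learns low-dimensional embeddings that score every (entity pair, relation) cell of the resulting matrix. The sketch below is a toy logistic matrix factorization in that spirit; the data, dimensions, and training loop are illustrative, not the dissertation's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "universal schema": the column set is the union of one structured
# relation and two textual surface patterns; rows are entity pairs.
relations = ["per:employee_of", "X works for Y", "X joined Y"]
pairs = [("Smith", "Acme"), ("Lee", "Globex"), ("Kim", "Initech")]
positives = [(0, 0), (0, 1), (1, 1), (2, 2)]  # observed (pair, rel) cells
negatives = [(0, 2), (1, 2), (2, 0), (2, 1)]  # cells assumed false
# Held out: does per:employee_of hold for ("Lee", "Globex"), a pair seen
# only with the textual pattern "X works for Y"?

dim, lr = 8, 0.1
P = rng.normal(scale=0.1, size=(len(pairs), dim))      # pair embeddings
R = rng.normal(scale=0.1, size=(len(relations), dim))  # relation embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Logistic matrix factorization by SGD: score(pair, rel) = sigmoid(p . r).
for _ in range(3000):
    label, cells = (1.0, positives) if rng.random() < 0.5 else (0.0, negatives)
    i, j = cells[rng.integers(len(cells))]
    err = sigmoid(P[i] @ R[j]) - label          # gradient of the log loss
    P[i], R[j] = P[i] - lr * err * R[j], R[j] - lr * err * P[i]

# The held-out structured cell should now score well above the explicit
# negatives: the embeddings tie the pattern to the relation via Smith.
print(f"score: {sigmoid(P[1] @ R[0]):.2f}")
```

The point of the factorization is exactly the fidelity problem the abstract raises: no single target schema has to absorb everything, because textual patterns and structured relations share one embedding space and lend each other evidence.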