Implementing a Lambda Architecture to perform real-time updates

dc.contributor.author: Gudipati, Pramod Kumar
dc.date.accessioned: 2016-11-17T21:55:47Z
dc.date.available: 2016-11-17T21:55:47Z
dc.date.graduationmonth: December [en_US]
dc.date.issued: 2016-12-01 [en_US]
dc.date.published: 2016 [en_US]
dc.description.abstract: The Lambda Architecture is a new paradigm for big data processing that balances throughput, latency, and fault tolerance. No single tool provides a complete solution that combines high accuracy, low latency, and high throughput, which motivated the idea of using a set of tools and techniques to build a complete big data system. The Lambda Architecture defines three layers into which these tools and techniques fit: the Speed Layer, the Serving Layer, and the Batch Layer. Each layer satisfies a set of properties and builds upon the functionality provided by the layers beneath it. The Batch Layer stores the master dataset, an immutable, append-only set of raw data, and pre-computes results using a distributed processing system such as Hadoop or Apache Spark that can handle large quantities of data. The Speed Layer captures and processes new data arriving in real time. The Serving Layer contains a parallel processing query engine, which takes results from both the Batch and Speed Layers and responds to queries in real time with low latency. Stack Overflow is a question-and-answer forum with a huge user community and millions of posts, and it has grown rapidly over the years. This project demonstrates the Lambda Architecture by constructing a data pipeline that adds a new “Recommended Questions” section to the Stack Overflow user profile and updates the suggested questions in real time. In addition, statistics such as trending tags and user performance numbers such as UpVotes and DownVotes are shown in the user dashboard by querying the batch processing layer. [en_US]
dc.description.advisor: William Hsu [en_US]
dc.description.degree: Master of Science [en_US]
dc.description.department: Department of Computing and Information Sciences [en_US]
dc.description.level: Masters [en_US]
dc.identifier.uri: http://hdl.handle.net/2097/34517
dc.language.iso: en [en_US]
dc.publisher: Kansas State University [en]
dc.subject: Lambda Architecture [en_US]
dc.subject: Big Data
dc.subject: Stack overflow
dc.subject: computer science
dc.title: Implementing a Lambda Architecture to perform real-time updates [en_US]
dc.type: Report [en_US]

Files

Original bundle
Name: PramodKumarGudipati2016.pdf
Size: 2.75 MB
Format: Adobe Portable Document Format
Description:

License bundle
Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed upon to submission
Description: