Developing, Optimizing and Hosting Data-Driven Web Applications
Abstract
Building web applications with current systems is difficult, and
developers face the following challenges: (1) It is difficult to
program web applications on top of the standard three-tier
architecture. (2) Performance optimizations and tunings are mostly
done manually, which is tedious, error-prone and suboptimal. (3) It
is hard for non-technical users to construct web applications for
their own needs. (4) Current platforms do not scale to host
a large number of applications in a cost-effective, manageable and/or
flexible manner. In this thesis, we propose technologies to address
those challenges in developing, optimizing and hosting data-driven
web applications.
Data-driven web applications are usually structured following the
standard three-tier architecture with different programming models
used at different tiers. This division not only creates an impedance
mismatch problem for developers but also forces them to manually
partition application logic across tiers, which results in complex
logic, suboptimal system design, and expensive re-partitioning of
applications as systems evolve. We propose a unified development
platform based on HILDA, a high-level language for developing
data-driven web applications. The primary benefits of HILDA over
existing development platforms are: (a) it uses a unified
data and programming model for all layers of the application, (b) it is
declarative, (c) it enables conflict detection for concurrent
updates, (d) it supports structured programming for web sites, and (e) it
separates application logic from presentation. Instead of using
different languages for different layers, developers build the whole
application in HILDA. HILDA code is translated into executables
that run on top of the three-tier architecture. The runtime system
automatically partitions the application logic between tiers based on
runtime properties of the application, optimizing system performance
while obeying memory constraints
at the clients. We evaluate our methodology with traces from a real
Course Management System used at Cornell University as well as an
online bookstore from the TPC-W benchmark. The results show that
automatic partitioning outperforms manual partitioning without the
associated development overhead.
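As a rough illustration of partitioning under a client memory constraint, the sketch below greedily places components at the client in order of estimated benefit per kilobyte. It is not HILDA's partitioning algorithm, which the abstract does not specify; the `Component` fields, the benefit measure, and the greedy heuristic are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    # All fields are hypothetical, chosen only for this illustration.
    name: str
    client_memory_kb: int      # memory needed if the component runs at the client (> 0)
    saved_round_trips: float   # estimated server round trips avoided per session


def partition(components, client_budget_kb):
    """Greedy heuristic: prefer components with the highest benefit per KB.

    Returns (client_names, server_names). Components that do not fit in the
    remaining client memory budget stay on the server.
    """
    ranked = sorted(
        components,
        key=lambda c: c.saved_round_trips / c.client_memory_kb,
        reverse=True,
    )
    client, server, used_kb = [], [], 0
    for component in ranked:
        if used_kb + component.client_memory_kb <= client_budget_kb:
            client.append(component.name)
            used_kb += component.client_memory_kb
        else:
            server.append(component.name)
    return client, server
```

A runtime system could re-run such a decision as measured properties change, which is the kind of re-partitioning that is costly to do by hand.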
There are many cases where non-technical users want to build
data-driven web applications to fit their own needs. An emerging
trend in Social Networking sites and Web portals is the opening up
of APIs to external application developers. For example, the
Facebook Platform, Google Gadgets and Yahoo! Widgets allow users to
design their own applications, which can then be integrated with
the platform and shared with others. However, current APIs are
targeted towards developers with programming expertise and database
knowledge; they are not accessible to a large class of users who do
not have a programming/database background but would nevertheless
like to create new applications. To address this need, we have
developed the AppForge system, which provides a WYSIWYG application
development platform. Users can graphically specify the components
of webpages inside a Web browser, and the corresponding database
schema and application logic are automatically generated on the
fly by the system. The WYSIWYG interface gives instantaneous
feedback on what users have just created and allows them to run, test and
continuously refine their applications, which greatly lowers the bar
for building such applications.
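To make the idea of generating a database schema from a graphical page specification concrete, here is a minimal sketch that turns a list of form fields into a `CREATE TABLE` statement. The widget types, the type mapping, and the function name are assumptions for illustration only; they do not describe AppForge's actual generator.

```python
# Hypothetical mapping from form widget types to SQL column types.
SQL_TYPES = {
    "text": "VARCHAR(255)",
    "number": "INTEGER",
    "date": "DATE",
}


def create_table_sql(form_name, fields):
    """Build a CREATE TABLE statement from (field_name, widget_type) pairs.

    A surrogate integer key is added so that every generated table has a
    primary key. Names are assumed to be trusted identifiers in this sketch.
    """
    columns = ["id INTEGER PRIMARY KEY"]
    for field_name, widget_type in fields:
        columns.append(f"{field_name} {SQL_TYPES[widget_type]}")
    return f"CREATE TABLE {form_name} ({', '.join(columns)});"
```

For example, a form named `book` with a text field `title` and a number field `price` would yield one table with three columns.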
While each user-generated application by itself is quite small (in
terms of size and throughput requirements), there are many such
applications and existing data management solutions are not designed
to handle this form of scalability in a cost-effective, manageable
and/or flexible manner. For instance, large installations of
commercial database systems such as Oracle, DB2 and SQL Server are
usually very expensive and difficult to manage. At the other
extreme, low-cost data hosting solutions such as Amazon's SimpleDB
do not support sophisticated data manipulation primitives such as
joins that are necessary for developing most Web applications. To
address this issue, we explore a new point in the design space
whereby we use commodity hardware and free software (MySQL) to scale
to a large number of applications while still supporting full SQL
functionality, transactional guarantees, high availability and
Service Level Agreements (SLAs). We do so by exploiting the key
property that each application is ``small'' and can fit in a single
machine (which can possibly be shared with other applications).
Using this property, we design replication strategies, data
migration techniques and load balancing operations that automate the
tasks that would otherwise contribute to the operational and
management complexity of dealing with a large number of
applications. We have conducted extensive experiments, based on the
TPC-W benchmark data sets and workloads, to study the performance
aspects of our system. Our experiments demonstrate that our system
can host a very large number of Web applications and provide them
rich functionality, strong consistency, high performance, high
availability and data protection in an inexpensive manner by using
commodity hardware and software components.
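The property that each application fits on a single, possibly shared, machine suggests a bin-packing view of placement. The sketch below is one simple way to express it: a first-fit-decreasing heuristic that puts each replica of an application on a different machine. It is not the thesis's replication or load balancing strategy; the load units, capacity, and replica count are assumed for the example.

```python
def place(apps, machine_capacity, replicas=2):
    """Place each application's replicas on distinct shared machines.

    apps: dict mapping application name to its load (hypothetical units).
    Returns a list of machines, each a dict with remaining 'free' capacity
    and the list of 'apps' it hosts. Heaviest applications are placed first.
    """
    machines = []
    for name, load in sorted(apps.items(), key=lambda kv: kv[1], reverse=True):
        if load > machine_capacity:
            raise ValueError(f"{name} does not fit on a single machine")
        placed = 0
        # Each existing machine is visited at most once per application,
        # so two replicas never share a machine.
        for machine in machines:
            if placed == replicas:
                break
            if machine["free"] >= load:
                machine["free"] -= load
                machine["apps"].append(name)
                placed += 1
        # Open new machines for any replicas that did not fit.
        while placed < replicas:
            machines.append({"free": machine_capacity - load, "apps": [name]})
            placed += 1
    return machines
```

Keeping replicas on separate machines means the failure of one machine leaves at least one copy of every application available, at the cost of hosting each application's load more than once.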
Sponsorship
This work was supported by the National Science Foundation under Grant No. 534404.
Date Issued
2008-07-24T20:41:05Z
Keywords
Database; Data-Driven Web Application; Performance; Scalability; WYSIWYG; Optimization
Types
dissertation or thesis