Friday, January 16, 2009

An Introduction to Conditional Random Fields for Relational Learning

by Charles Sutton and Andrew McCallum (UMass)

[Sutton07a] C. Sutton and A. McCallum. An Introduction to Conditional Random Fields for Relational Learning. In L. Getoor and B. Taskar, editors, Introduction to Statistical Relational Learning. MIT Press, 2007.

Good intro to conditional random fields (CRFs)!

Covers the relation of naive Bayes to logistic regression, of directed to undirected graphical models, and of Markov chains (HMMs) to linear-chain CRFs.

Also covers the role of p(x) in generative models. Discriminative models do not model p(x) at all, which relaxes the constraints on model parameters and frees the model from (in)dependence assumptions about the inputs.
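For reference, here is a sketch of the HMM / linear-chain CRF contrast in standard notation. The symbols are my own choice for these notes and may differ from the chapter's: x is the observation sequence, y the label sequence, f_k are feature functions with weights \lambda_k, and y_0 is a fixed start state.

```latex
\begin{align*}
% HMM (generative): models the joint, and therefore implicitly p(x)
p(\mathbf{y}, \mathbf{x}) &= \prod_{t=1}^{T} p(y_t \mid y_{t-1})\, p(x_t \mid y_t) \\
% Linear-chain CRF (discriminative): models only the conditional
p(\mathbf{y} \mid \mathbf{x}) &= \frac{1}{Z(\mathbf{x})}
  \exp\Big( \sum_{t=1}^{T} \sum_{k=1}^{K} \lambda_k f_k(y_t, y_{t-1}, x_t) \Big) \\
% Normalizer sums over all label sequences for this particular x
Z(\mathbf{x}) &= \sum_{\mathbf{y}'}
  \exp\Big( \sum_{t=1}^{T} \sum_{k=1}^{K} \lambda_k f_k(y'_t, y'_{t-1}, x_t) \Big)
\end{align*}
```

Because Z(x) normalizes per input sequence, the weights \lambda_k need not be log-probabilities, and no distribution over x is ever specified.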

Thursday, January 15, 2009

Project Athena as a Distributed Computer System

[Champine90] George A. Champine, Daniel E. Geer, Jr., and William N. Ruh, “Project Athena as a Distributed Computer System,” IEEE Computer, IEEE, vol. 23, September 1990, 40-50.

This paper describes Athena, a distributed system implemented at MIT for student use, and compares it to other distributed operating systems. The project was supported by DEC and IBM.

The designers assumed that computers were too expensive for students to buy at the time, but might become cheaper in the future.

Key requirements:

* Scalability: must scale up to 10,000+ computers
* Reliability: must be available 24/7 even when some components fail.
* Public workstations: any user can use any workstation.
* Security: System services must be secure even though individual workstations are not.
* Heterogeneity: The system must support a variety of hardware platforms.
* Coherency: All system applications must run on all workstations. Consistent look and feel.
* Affordability: low cost to own and operate.

Definitions:
user
client
server
service
name
binding
resolving
coherence
interoperability
authentication
authorization
fail-soft

mainframe model vs unified model
- security, resource allocation, privacy/network, mail, maintenance

Other distributed operating systems:
* Amoeba
* Andrew
* Dash
* Eden
* Grapevine
* HCS
* Locus
* Mach
* Sprite
* V

Athena in terms of the requirements

System comparisons

Issues:
* Naming: hosts, printers, services, files, users; replication vs. partitioning.
* Scalability: any mechanism whose cost grows linearly with system size is probably not feasible in general.
* Security: authentication and authorization. Centralized and encrypted password-checking service. Public-key cryptography. Key distribution servers. Access control lists and capability-based authorization.
* Compatibility: binary level, execution level, and protocol level.
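To make the access control list vs. capability distinction from the security item concrete, here is a toy sketch. It is not Athena's mechanism. The resource names, users, and the HMAC-signed token format are all hypothetical, chosen only for illustration.

```python
import hashlib
import hmac
import secrets

# ACL style: the resource side keeps a table of who may do what.
# (Hypothetical resource and user names.)
ACL = {"printer-1": {"alice": {"print"}, "bob": {"print", "admin"}}}


def acl_allows(resource: str, user: str, action: str) -> bool:
    """Check the caller's identity against the resource's list."""
    return action in ACL.get(resource, {}).get(user, set())


# Capability style: the holder presents an unforgeable token that names
# the resource and action; the server does not need to know who holds it.
SERVER_KEY = secrets.token_bytes(32)  # secret known only to the server


def _tag(resource: str, action: str) -> str:
    msg = f"{resource}:{action}".encode()
    return hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()


def issue_capability(resource: str, action: str) -> tuple:
    """Return a (resource, action, tag) token signed by the server."""
    return (resource, action, _tag(resource, action))


def capability_allows(cap: tuple, resource: str, action: str) -> bool:
    """Accept only an untampered token that matches the request."""
    cap_resource, cap_action, tag = cap
    genuine = hmac.compare_digest(tag, _tag(cap_resource, cap_action))
    return genuine and (cap_resource, cap_action) == (resource, action)
```

With an ACL, revoking access means editing one table on the server. With capabilities, delegation is easy (hand over the token) but revocation is harder, since the server keeps no record of who holds what.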


Athena system design
* name service: Hesiod. Berkeley BIND. Fast front end to Moira.
* file service: NFS, AFS
* printing service
* mail service
* real-time notification: Zephyr
* service management: Moira. Configuration for mail, disk quotas, hardware config, post office allocation, and access control lists.
* authentication: Kerberos. Login and logout. Tickets.
* installation and update
* online consulting, Discuss


Design of Distributed Systems Supporting Local Autonomy

[Clark80] David D. Clark and Liba Svobodova, “Design of Distributed Systems Supporting Local Autonomy,” 20th IEEE COMPCON, IEEE, February 1980, 438-444.

Uses an analogy to real-world organizations.

Individual nodes cooperate in a standardized manner but maintain a fair degree of autonomy with respect to their management and internal organization.

Predicts that this will be the most widely used paradigm for distributed systems.

As opposed to the Athena paper, which says that distribution is about cost, this paper says that distribution is fundamentally about the needs of the problem to which distribution is applied: many applications are naturally distributed.

Also, as opposed to the Athena paper, which describes a realized implementation, this paper is more theoretical.

Components: nodes (PCs), servers, communication substrate

Issues considered: efficiency, reliability, transaction integrity, and expandability.

There was a part that seemed anti-RPC, where they argued that the application programmer should know whether the functions or data being used are local or remote (however, they state that this may be hidden from the end user).


End-to-end Arguments In System Design

[Saltzer81] J. H. Saltzer, D. P. Reed, and D. D. Clark, “End-to-end Arguments In System Design,” ACM Transactions on Computer Systems, ACM Press, vol. 2, no. 4, 1984, 277-288.

This paper argues that excessive concern about the reliability of individual low-level components is unnecessary, because correctness checks must be done end to end at the application level regardless.
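A minimal sketch of the idea, loosely modeled on a file-transfer scenario. The `unreliable_channel` function, the error rate, and the retry limit are all hypothetical, and the checksum itself is assumed to reach the receiver intact, which a real system would also have to ensure.

```python
import hashlib
import random


def unreliable_channel(data: bytes, rng: random.Random, error_rate: float) -> bytes:
    """Hypothetical lower layer: sometimes flips one bit in transit."""
    if data and rng.random() < error_rate:
        corrupted = bytearray(data)
        i = rng.randrange(len(corrupted))
        corrupted[i] ^= 0x01  # flip the lowest bit of one byte
        return bytes(corrupted)
    return data


def end_to_end_transfer(data: bytes, rng: random.Random,
                        error_rate: float = 0.5, max_attempts: int = 20):
    """Send data, verify a checksum at the receiving application, retry on mismatch.

    Returns (received_bytes, attempts_used).
    """
    expected = hashlib.sha256(data).digest()  # computed by the sending application
    for attempt in range(1, max_attempts + 1):
        received = unreliable_channel(data, rng, error_rate)
        # The end-to-end check: only the application knows what "correct" means.
        if hashlib.sha256(received).digest() == expected:
            return received, attempt
    raise RuntimeError("transfer failed after max_attempts")
```

However reliable the channel is made, the final comparison is still needed to catch errors introduced anywhere along the path, so lower-level reliability only ever acts as a performance optimization that reduces retries.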