Web Operations (网站运维)

By John Allspaw (USA) and Jesse Robbins (USA). Southeast University Press

Publication date: 2011-1

Publisher: Southeast University Press

Authors: John Allspaw (USA), Jesse Robbins (USA)

Pages: 315

Tags: none

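Summary

Web applications involve many kinds of specialists, but web operations staff must make sure that every part of an application works throughout its life cycle. When a startup is hit by an unexpected traffic spike, or when a new feature breaks a mature application, this is the expertise you need. In this collection of essays and interviews, web operations veterans Theo Schlossnagle, Baron Schwartz, and Alistair Croll offer their insights into this fast-changing field. You will also learn what it takes to make a website thrive, first-hand from the builders of some of the largest sites.
· Learn the skills of web operations, and why they come from experience rather than schooling
· Understand why it is important to collect metrics from both the application and the infrastructure
· Consider common approaches to database architecture and the pitfalls that come with growing scale
· Learn how to handle the human factors of outages and degradations
· Find ways to avoid disaster after a huge surge of traffic
· Work out what went wrong after a problem occurs, and keep it from happening again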

About the authors

Authors: John Allspaw (USA), Jesse Robbins (USA)

Table of contents

foreword
preface
1 web operations: the career
　theo schlossnagle
　why does web operations have it tough?
　from apprentice to master
　conclusion
2 how picnik uses cloud computing: lessons learned
　justin huff
　where the cloud fits (and why!)
　where the cloud doesn't fit (for picnik)
　conclusion
3 infrastructure and application metrics
　john allspaw, with matt massie
　time resolution and retention concerns
　locality of metrics collection and storage
　layers of metrics
　providing context for anomaly detection and alerts
　log lines are metrics, too
　correlation with change management and incident timelines
　making metrics available to your alerting mechanisms
　using metrics to guide load-feedback mechanisms
　a metrics collection system, illustrated: ganglia
　conclusion
4 continuous deployment
　eric ries
　small batches mean faster feedback
　small batches mean problems are instantly localized
　small batches reduce risk
　small batches reduce overhead
　the quality defenders' lament
　getting started
　continuous deployment is for mission-critical applications
　conclusion
5 infrastructure as code
　adam jacob
　service-oriented architecture
　conclusion
6 monitoring
　patrick debois
　story: "the start of a journey"
　step 1: understand what you are monitoring
　step 2: understand normal behavior
　step 3: be prepared and learn
　conclusion
7 how complex systems fail
　john allspaw and richard cook
　how complex systems fail
　further reading
8 community management and web operations
　heather champ and john allspaw
9 dealing with unexpected traffic spikes
　brian moon
　how it all started
　alarms abound
　putting out the fire
　surviving the weekend
　preparing for the future
　cdn to the rescue
　proxy servers
　corralling the stampede
　streamlining the codebase
　how do we know it works?
　the real test
　lessons learned
　improvements since then
10 dev and ops collaboration and cooperation
　paul hammond
　deployment
　shared, open infrastructure
　trust
　on-call developers
　avoiding blame
　conclusion
11 how your visitors feel: user-facing metrics
　alistair croll and sean power
　why collect user-facing metrics?
　what makes a site slow?
　measuring delay
　building an sla
　visitor outcomes: analytics
　other metrics marketing cares about
　how user experience affects web ops
　the future of web monitoring
　conclusion
12 relational database strategy and tactics for the web
　baron schwartz
　requirements for web databases
　how typical web databases grow
　the yearning for a cluster
　database strategy
　database tactics
　conclusion
13 how to make failure beautiful: the art and science of postmortems
　jake loomis
　the worst postmortem
　what is a postmortem?
　when to conduct a postmortem
　who to invite to a postmortem
　running a postmortem
　postmortem follow-up
　conclusion
14 storage
　anoop nagwani
　data asset inventory
　data protection
　capacity planning
　storage sizing
　operations
　conclusion
15 nonrelational databases
　eric florenzano
　nosql database overview
　some systems in detail
　conclusion
16 agile infrastructure
　andrew clay shafer
　agile infrastructure
　so, what's the problem?
　communities of interest and practice
　trading zones and apologies
　conclusion
17 things that go bump in the night (and how to sleep through them)
　mike christian
　definitions
　how many 9s?
　impact duration versus incident duration
　datacenter footprint
　gradual failures
　trust nobody
　failover testing
　monitoring and history of patterns
　getting a good night's sleep
contributors
index

Chapter excerpt

Copyright page: Illustration:

capacity planning needs, the daily resolution is fine. Adding higher resolution more than once per day wouldn't change any of the results and would only increase the amount of time it would take to run reports or make it a pain to move the data around. Gathering these metrics once a day can be as simple as a nightly cron job working on a replicated slave database kept solely for crunching these numbers. Because we store these metrics in a database, being able to manipulate or correlate data across different metrics is pretty straightforward, because the date is held constant across metrics.

For example, it might not be a surprise that during the holiday season, the average size of photo uploads increases significantly compared to the rest of the year, because of the new digital cameras being given as gifts during that time. Because we have those values, we can lay out others on the same dates. Then, it's not difficult to see how average upload size can increase disk space consumption (because the original sizes are larger), which can increase Flickr Pro subscriptions (because the limits are extended, compared to free accounts).
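The excerpt's idea of storing one value per metric per day and lining metrics up on the shared date can be sketched roughly as follows. This is only an illustration, not the book's implementation: SQLite stands in for the replicated database, and the table name `daily_metrics`, the metric names, and all numbers are invented for the example.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create (if needed) a table holding one value per metric per day."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS daily_metrics (
            day    TEXT NOT NULL,
            metric TEXT NOT NULL,
            value  REAL NOT NULL,
            PRIMARY KEY (day, metric)
        )
        """
    )
    return conn


def record(conn, day, metric, value):
    """What a nightly job might call once per metric; a re-run overwrites."""
    conn.execute(
        "INSERT OR REPLACE INTO daily_metrics (day, metric, value) "
        "VALUES (?, ?, ?)",
        (day, metric, value),
    )
    conn.commit()


def side_by_side(conn, metric_a, metric_b):
    """Lay two metrics out on the same dates by joining on the day column."""
    rows = conn.execute(
        "SELECT a.day, a.value, b.value "
        "FROM daily_metrics AS a "
        "JOIN daily_metrics AS b ON a.day = b.day "
        "WHERE a.metric = ? AND b.metric = ? "
        "ORDER BY a.day",
        (metric_a, metric_b),
    )
    return rows.fetchall()


# Invented sample values, purely for illustration.
conn = open_store()
record(conn, "2009-12-24", "avg_upload_kb", 900.0)
record(conn, "2009-12-24", "disk_used_gb", 5000.0)
record(conn, "2009-12-25", "avg_upload_kb", 1400.0)
record(conn, "2009-12-25", "disk_used_gb", 5080.0)
record(conn, "2009-12-26", "avg_upload_kb", 1350.0)  # no disk sample yet
print(side_by_side(conn, "avg_upload_kb", "disk_used_gb"))
```

Because every row is keyed by day, only dates on which both metrics were collected appear in the joined result. That is the sense in which the date is "held constant" across metrics.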

The print quality is good. The book is entirely in English; anyone interested in operations should take a look.

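Because my work touches on this material, I bought the English original. First, the writing: the English is fluent and well phrased, and does not read like something an ordinary engineer would write. Second, as a collection, it covers the problems met in operations from many angles. Some of the content may be less applicable in certain industries (banking, for example, rarely uses continuous deployment), but most of it stays close to front-line work, and I gained a lot from it.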

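I saw this book in a bookstore and liked it, so I bought it on Amazon. The quality of the reprint edition is fine, and it is a good book.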

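I am reading this book now. These experts draw on their own experience to tell the story of web operations. Most books we read teach methods; this one says little about methods and instead teaches the underlying mindset.
The methods we need can mostly be learned through Google, but very few people can explain the mindset clearly, and fewer still have distilled it into something this concentrated.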

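Few books on Internet operations rise to the level of "the Way" (道); this is one of them. It touches on data collection and measurement, continuous deployment, monitoring, disaster recovery, failure analysis, and other topics. As Internet services keep growing, the responsibilities of development and operations can no longer be cleanly separated. Business applications have to be tightly integrated with monitoring, data collection, deployment, configuration management, and other systems before they add up to a complete system. The book's main points on service-oriented architecture:

It should be modular: "Do one thing, and do it well."
It should be cooperative: "Let us become a village." Every service needs to expose an API so that other systems can work with it.
It should be composable: "Everything should be ready." Once the modular services are in place, they can be combined into more complex services.
Worth reading for both developers and operations staff.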
　　
　　http://jolestar.com/recently-reading-notes/
