Stability: Smarter Monitoring of Application Crashes

I had posted about the way Tumblr uses time-series monitoring to alert on crash spikes in the Android and iOS applications. Since then, we’ve done a lot of work to reduce the overall volume of crashes. As a result, we created a new problem: it was possible for a handful of people, caught in crash cycles, to cause our stability alerts to trigger.

Once the stability alert is triggered, we typically start looking in the crash logging systems, like Crashlytics or Sentry, to find more information about the crash. We found an increasing number of occurrences where no particular crash could be easily identified as causing the spike.

Getting paged at 2am because of a stability alert is not great, but not finding a crash was even worse.

The problem was the way we were monitoring. Simply watching all crash events wasn’t good enough. We had to start normalizing the events across the users. Thankfully, we collect events and not simple ticks. We have rich data in the crash event, including a way to group events coming from the same device.
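To make the distinction concrete, here is a minimal Python sketch of the two aggregations; the timestamp and device_id field names are illustrative, not our actual event schema:

from collections import Counter

def crash_counts(events, window_minutes=5):
    """Bucket crash events by time and compute both monitoring styles."""
    raw = Counter()
    devices = {}  # bucket -> set of device ids seen in that window
    for event in events:
        bucket = event["timestamp"] // (window_minutes * 60)
        raw[bucket] += 1
        devices.setdefault(bucket, set()).add(event["device_id"])
    # Alerting on unique devices means a couple of devices stuck in a
    # crash loop can no longer trigger the pager on their own.
    unique = {bucket: len(ids) for bucket, ids in devices.items()}
    return raw, unique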

Here is an example of the two styles of monitoring over the last week:

[Chart: Raw Count shows a large spike, while the Unique Device Count shows a normal trend]

That after-midnight Raw Count spike on Friday would have paged our on-call person if we hadn’t changed to alerting on the Unique Device Count instead. We still use the Raw Counts to identify issues and investigate, but we don’t alert on them. We can use the high-cardinality events to zero in on the cause of a spike. In this case, two people were having a bad experience using their Cubot Echo devices.

Since moving to the new alerting metric, we’ve had far fewer after-hours pages, while still being able to focus on the stability of the applications across our user base.

Performance: The Merits of Measuring

If you can not measure it, you can not improve it. – Lord Kelvin.

There is so much information out there on ways to improve the performance of your mobile application or website. You probably feel you can just dive in and start making changes. But if you’re not measuring your application’s performance, you don’t know if anything is really helping or hurting. How do you know what effect any changes will have on the performance? Most applications are complex enough that we can’t assume our simplistic reasoning accurately reflects the code behavior. You need to measure.

You need measurements from before and after any changes are made. Your application has development phases, and so should your measurement plan. Measure in CI to find improvements and regressions as soon as they happen. Measure in the real world to find how variations like network conditions, device fragmentation, and unpredictable user behavior manifest in performance.

Measuring in CI

The point of measuring performance in CI is to control variability and watch for relative differences on each change. Try to reduce the noisy variables like network calls and background services to create a fast and consistent surface on which you can monitor performance changes in a reliable and repeatable manner. The purpose is not to determine the performance your users will encounter. There are too many variables in the real world and you can’t control them well enough for apples-to-apples comparisons.

Use real devices when measuring performance, not emulators or simulators running on host hardware.

You can try to create reliable & repeatable simulations of some real world situations. Network connection speed is one example. You can use a network simulator, like Facebook’s Augmented Traffic Control system to simulate WiFi and mobile network conditions. This is especially useful if your application is designed to react differently under different network conditions. You can also use different types of content in the tests, trying to mimic some high level differences your users might encounter.

If you’re measuring data in CI, you should be storing and displaying it as well. Try to get the CI to alert on regressions, failing changes before they make it into the product.

Some common things to measure in CI:

  • Launch time to show UI
  • Launch time to interactive UI
  • Scroll performance (janky frames)
  • Time to load content
  • Memory usage (startup, after content loads, after scrolling content)

Remember to use multiple (physical) devices and a variety of content types.
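As one concrete possibility on Android, launch time can be sampled on a physical device with adb shell am start -W, which reports timing for the launched activity. A rough Python sketch; the package and activity names are placeholders:

import re
import subprocess

def measure_launch_ms(package, activity, runs=5):
    """Cold-launch the app several times, returning TotalTime samples (ms)."""
    samples = []
    for _ in range(runs):
        # Kill the app so every run measures a cold start.
        subprocess.run(["adb", "shell", "am", "force-stop", package], check=True)
        result = subprocess.run(
            ["adb", "shell", "am", "start", "-W", "-n",
             package + "/" + activity],
            capture_output=True, text=True, check=True)
        match = re.search(r"TotalTime:\s*(\d+)", result.stdout)
        if match:
            samples.append(int(match.group(1)))
    return samples

# Example (hypothetical names): measure_launch_ms("org.example.app", ".MainActivity")

Collecting several runs per change, then comparing distributions rather than single numbers, helps keep the CI signal reliable and repeatable.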

Measuring in Real World

While CI measurements come from only a handful of tests & situations, the real world has many, many more situations. Depending on the number of active users, you could have millions of data points with thousands of unique situations. Collecting data from real users, at a large scale, allows you to investigate how things like global regions, network conditions — and even user types — can affect the performance of the application.

There are many third-party systems you can integrate into your code to easily and efficiently collect real world data. It’s not uncommon for companies to grow their own systems as well. In any case, make sure you are validating the data itself. Real world data is messy, so vet both the collection systems and the data. Look for problems like payload corruption, clock skew, range errors or other oddities.
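As a sketch of what that vetting might look like in practice (the field names and thresholds here are invented for illustration):

import time

MAX_SKEW_SECONDS = 24 * 60 * 60  # reject clocks more than a day off

def is_valid_measurement(m):
    """Reject obviously corrupt or skewed real-world data points."""
    required = ("timestamp", "duration_ms", "payload_bytes")
    if any(key not in m for key in required):
        return False  # payload corruption / truncation
    if abs(m["timestamp"] - time.time()) > MAX_SKEW_SECONDS:
        return False  # clock skew on the device
    if not (0 <= m["duration_ms"] < 10 * 60 * 1000):
        return False  # range error: negative or absurdly long duration
    return m["payload_bytes"] >= 0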

Create automated queries and reports, sent out broadly for people to review. Remember to go deeper than high-level summaries. Some of the interesting discoveries happen when you split out data across different dimensions.

Some common things to measure from real world:

  • Network usage, including start time, end time, content type and size of the response. Get detailed connection timing, if possible, for DNS and SSL handshake information.
    • For API endpoints, this is useful for tracking latency and payload size
    • For media loading, this gives a ballpark metric for how long people are staring at an empty box, waiting for an image to load.
  • Event, session and error state data. This can be used to track critical content impressions, but also can be used to learn how people use the application.

Remember to include some common metadata in each measurement so you can split out the data across different dimensions. Things like a non-PII identifier, generic geo-location/region, device specs and connection type/speed can help you drill down into the data, looking for trends.

It’s also polite to allow people to opt-out of this type of data collection.

Fun with Telemetry: Improving Our User Analytics Story

My last post talked about the initial work to create a real user analytics system based on the UI Telemetry event data collected by Firefox on Mobile. I’m happy to report that we’ve made a lot of progress since then. Most importantly, we are no longer using the DIY setup on one of my Mac Minis. Working with the Mozilla Telemetry & Data team, we now have a system that extracts data from UI Telemetry via Spark, imports it into Presto-based storage, and allows SQL queries and visualization via Re:dash.

With data accessible via Re:dash, we can use SQL to focus on improving our analyses:

  • Track Active users, daily & monthly
  • Explore retention & churn
  • Look into which features lead to retention
  • Calculate user session length & event counts per session
  • Use funnel analysis to evaluate A/B experiments
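To give a flavor of these queries, here is a hedged sketch of a daily active users query as it might be run from Re:dash; the events table and column names are illustrative, not our actual schema:

# Illustrative only: assumes an events table with one row per event,
# a clientid column, and a submission_date column.
DAU_QUERY = """
SELECT submission_date AS day,
       COUNT(DISTINCT clientid) AS dau
FROM events
GROUP BY submission_date
ORDER BY submission_date
"""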

[Charts: loadurl-types, loadurl-retention-effect, dropoff-rate]

Roberto posted about how we’re using Parquet, Presto and Re:dash to create an SQL-based query and visualization system.

Fun with Telemetry: DIY User Analytics Lab in SQL

Firefox on Mobile has a system to collect telemetry data from user interactions. We created a simple event and session UI telemetry system, built on top of the core telemetry system. The core telemetry system has been mainly focused on performance and stability. The UI telemetry system is really focused on how people are interacting with the application itself.

Event-based data streams are commonly used to do user data analytics. We’re pretty fortunate to have streams of events coming from all of our distribution channels. I wanted to start doing different types of analyses on our data, but first I needed to build a simple system to get the data into a suitable format for hacking.

One of the best one-stop sources for a variety of user analytics is the Periscope Data blog. There are posts on active users, retention and churn, and lots of other cool stuff. The blog provides tons of SQL examples. If I could get the Firefox data into SQL, I’d be in a nice place.

Collecting Data

My first step was performing a little ETL (well, the E & T parts) on the raw data using the Spark/Python framework for Mozilla Telemetry. I wanted to create two datasets:

  • clients: Dataset of the unique clients (users) tracked in the system. Besides containing the unique clientId, I wanted to store some metadata, like the profile creation date. (script)
  • events: Dataset of the event stream, associated to each client. The event data also has information about active A/B experiments. (script)
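The extraction looked roughly like this. This sketch assumes the pings have already been loaded as a Spark RDD of parsed JSON dicts; the loading helper and ping field names are glossed over here and are not the exact Mozilla Telemetry API:

# Assumes `pings` is a Spark RDD of parsed UI Telemetry pings (dicts).
def to_client(ping):
    # One row per unique client, keeping a little metadata.
    return (ping["clientId"], {"profileDate": ping.get("profileDate")})

def to_events(ping):
    # One row per event, associated back to the client.
    for event in ping.get("events", []):
        yield {"clientId": ping["clientId"],
               "action": event.get("action"),
               "timestamp": event.get("timestamp"),
               "experiments": ping.get("experiments", [])}

clients = pings.map(to_client).reduceByKey(lambda a, b: a)
events = pings.flatMap(to_events)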

Building a Database

I installed Postgres on a Mac Mini (powerful stuff, I know) and created my database tables. I was periodically collecting the data via my Spark scripts and couldn’t guarantee I wouldn’t re-collect data from previous jobs, so I couldn’t just bulk insert the data. Instead, I wrote some simple Python scripts to import the data (clients & events) quickly, making sure not to create any duplicates.
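The import logic was essentially insert-if-missing. Here is a sketch of that shape using psycopg2; the table and column names are illustrative:

import psycopg2

def import_events(rows):
    """Insert event rows, skipping (clientid, ts, action) tuples that a
    previous collection run already imported."""
    conn = psycopg2.connect(dbname="telemetry")
    with conn, conn.cursor() as cur:
        for row in rows:
            cur.execute("""
                INSERT INTO events (clientid, ts, action)
                SELECT %(clientid)s, %(ts)s, %(action)s
                WHERE NOT EXISTS (
                    SELECT 1 FROM events
                    WHERE clientid = %(clientid)s
                      AND ts = %(ts)s
                      AND action = %(action)s)
            """, row)
    conn.close()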

[Diagram: fennec-telemetry-data]

I decided to start with 30 days of data from our Nightly and Beta channels. Nightly was relatively small (~330K rows of events), but Beta was more significant (~18M rows of events).

Analyzing and Visualizing

Now that I had my data, I could start exploring. There are a lot of analysis/visualization/sharing tools out there. Many are commercial and have lots of features. I stumbled across a few open-source tools:

  • Airpal: A web-based query execution tool from Airbnb. Makes it easy to save and share SQL analysis queries. Works with Facebook’s PrestoDB, but doesn’t seem to create any plots.
  • Re:dash: A web-based query, visualization and collaboration tool. It has tons of visualization support. You can set it up on your own server, but it was a little more than I wanted to take on over a weekend.
  • SQLPad: A web-based query and visualization tool. Simple and easy to set up, so I tried using it.

Even though I wanted to use SQLPad as much as possible, I found myself spending most of my time in pgAdmin. Debugging queries, using EXPLAIN to speed them up, and setting up indexes were all easier in pgAdmin. Once I got the basics figured out, I was able to use SQLPad more efficiently. Below are some screenshots using the Nightly data:

[Screenshots: sqlpad-query, sqlpad-chart]

Next Steps

Now that I have Firefox event data in SQL, I can start looking at retention, churn, active users, engagement and funnel analysis. Eventually, we want this process to be automated, data stored in Redshift (like a lot of other Mozilla data) and exposed via easy query/visualization/collaboration tools. We’re working with the Mozilla Telemetry & Data Pipeline teams to make that happen.

A big thanks to Roberto Vitillo and Mark Reid for the help in creating the Spark scripts, and Richard Newman for double-dog daring me to try this.

Firefox on Mobile: A/B Testing and Staged Rollouts

We have decided to start running A/B testing in Firefox for Android. These experiments are intended to optimize specific outcomes, as well as inform our long-term design decisions. We want to create the best Firefox experience we can, and these experiments will help.

The system will also allow us to throttle the release of features, called staged rollout or feature toggles, so we can monitor new features in a controlled manner across a large user base and a fragmented device ecosystem. If we need to roll back a feature for some reason, we’d have the ability to do that quickly, without needing people to update software.

Technical details:

  • Mozilla Switchboard is used to control experiment segmenting and staged rollout.
  • UI Telemetry is used to collect metrics about an experiment.
  • Unified Telemetry is used to track active experiments so we can correlate to application usage.

What is Mozilla Switchboard?

Mozilla Switchboard is based on Switchboard, an open source SDK for doing A/B testing and staged rollouts from the folks at KeepSafe. It connects to a server component, which maintains a list of active experiments.

The SDK does create a UUID, which is stored on the device. The UUID is sent to the server, which uses it to “bucket” the client, but the UUID is never stored on the server. In fact, the server does not store any data. The server we are using was ported to Node from PHP and is being hosted by Mozilla.
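Conceptually, that bucketing boils down to hashing the UUID into a stable bucket. Here is a sketch of the idea, not Switchboard’s actual implementation:

import hashlib

def bucket(uuid, num_buckets=100):
    """Map a device UUID to a stable bucket in [0, num_buckets)."""
    digest = hashlib.sha256(uuid.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

def in_experiment(uuid, percent_enabled):
    # A client is in the experiment if its bucket falls below the
    # rollout percentage.
    return bucket(uuid) < percent_enabled

Because the hash is deterministic, widening a rollout from, say, 10% to 50% only adds clients to the enabled set; existing clients keep their assignment, and the server never needs to store anything per client.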

We decided to start using Switchboard because it’s simple, open source, has client code for Android and iOS, saves no data on the server and can be hosted by Mozilla.

Planning Experiments

The Mobile Product and UX teams are the primary drivers for creating experiments, but as is common on the Mobile team, ideas can come from anywhere. We have been working with the Mozilla Growth team, getting a better understanding of how to design the experiments and analyze the metrics. UX researchers also have input into the experiments.

Once Product and UX complete the experiment design, Development lands code in Firefox to implement the desired variations of the experiment. Development also lands code in the Switchboard server to control the configuration of the experiment: On what channels is it active? How are the variations distributed across the user population?

Since we use Telemetry to collect metrics on the experiments, the Beta channel is likely our best place to run experiments. Telemetry is on by default on Nightly, Aurora and Beta, and Beta has the largest user base of those three channels.

Once we decide which variation of the experiment is the “winner”, we’ll change the Switchboard server configuration for the experiment so that 100% of the user base will flow through the winning variation.

Yes, a small percentage of the Release channel has Telemetry enabled, but it might be too small to be useful for experimentation. Time will tell.

What’s Happening Now?

We are trying to be very transparent about active experiments and staged rollouts. We have a few active experiments right now.

  • Onboarding A/B experiment with several variants.
  • Easy entry points for accessing History and Bookmarks on the main menu.
  • Experimenting with the awesomescreen behavior when displaying the search results page.

You can always look at the Mozilla Switchboard configuration to see what’s happening. Over time, we’ll be adding support to Firefox for iOS as well.

An Engineer’s Guide to App Metrics

Building and shipping a successful product takes more than raw engineering. I have been posting a bit about using Telemetry to learn about how people interact with your application so you can optimize use cases. There are other types of data you should consider too. Being aware of these metrics can help provide a better focus for your work and, hopefully, have a bigger impact on the success of your product.

Active Users

This includes daily active users (DAUs) and monthly active users (MAUs). How many people are actively using the product within a time-span? At Mozilla, we’ve been using these for a long time. From what I’ve read, these metrics seem less important when compared to some of the other metrics, but they do provide a somewhat easy-to-measure indicator of activity.

These metrics don’t give a good indication of how much people use the product, though. I have seen a variation called DAU/MAU (daily divided by monthly), which gives something like a retention or engagement rate. For example, 500K daily actives against 1M monthly actives gives a DAU/MAU of 50%. DAU/MAU rates of 50% are seen as very good.

Engagement

This metric focuses on how much people really use the product, typically tracking session length or time spent in the application. The amount of time people spend in the product is an indication of stickiness. Engagement can also help increase retention. Mozilla collects data on session length now, but we need to start associating metrics like this with some of our experiments to see if certain features improve stickiness and keep people using the application.

We look for differences across various facets like locales and releases, and hopefully soon, across A/B experiments.

Retention / Churn

Based on what I’ve seen, this is the most important category of metrics. There are variations in how these metrics can be defined, but they cover the same goal: Keep users coming back to use your product. Again, looking across facets, like locales, can provide deeper insight.

Rolling Retention: % of new users who return in the next day, week, or month
Fixed Retention: % of this week’s new users still engaged with the product over successive weeks
Churn: % of users who leave, divided by the total number of users

Most analysis tools, like iTunes Connect and Google Analytics, use Fixed Retention. Mozilla uses Fixed Retention with our internal tools.

I found some nominal guidance (grain of salt required):
1-week churn: 80% bad, 40% good, 20% phenomenal
1-week retention: 25% baseline, 45% good, 65% great
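For concreteness, here is a sketch of computing 1-week fixed retention, and churn as its complement, from an event log. It assumes each record is a (user, date) pair, which is not any particular tool’s format:

import datetime

def week_of(day):
    """Monday of the week containing `day`."""
    return day - datetime.timedelta(days=day.weekday())

def fixed_retention(events, cohort_week):
    """% of users new in cohort_week still active the following week."""
    first_seen = {}
    active = {}  # week -> set of user ids active that week
    for user, day in events:
        week = week_of(day)
        if user not in first_seen or week < first_seen[user]:
            first_seen[user] = week
        active.setdefault(week, set()).add(user)
    cohort = {u for u, w in first_seen.items() if w == cohort_week}
    next_week = cohort_week + datetime.timedelta(weeks=1)
    retained = cohort & active.get(next_week, set())
    retention = len(retained) / len(cohort) if cohort else 0.0
    # For this cohort, churn is simply the complement of retention.
    return retention, 1.0 - retention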

Cost per Install (CPI)

I have also seen this called Customer Acquisition Cost (CAC), but it’s basically the cost (mostly marketing or pay-to-play pre-installs) of getting a person to install a product. I have seen it in two forms: blended, where ‘installs’ are both organic and from campaigns; and paid, where ‘installs’ come only from campaigns. Paid CPI seems to be the more useful metric.

Lower CPI is better and Mozilla has been using Adjust with various ad networks and marketing campaigns to figure out the right channel and the right messaging to get Firefox the most installs for the lowest cost.

Lifetime Value (LTV)

I’ve seen this defined as the total value of a customer over the life of that customer’s relationship with the company. It helps determine the long-term value of the customer and can help provide a target for reasonable CPI. It’s weird thinking of “customers” and “value” when talking about people who use Firefox, but we do spend money developing and marketing Firefox. We also get revenue, maybe indirectly, from those people.

LTV works hand-in-hand with churn, since the length of the relationship is inversely proportional to the churn. The longer we keep a person using Firefox, the higher the LTV. If CPI is higher than LTV, we are losing money on user acquisition efforts.

Total Addressable Market (TAM)

We use this metric to describe the size of a potential opportunity. Obviously, the bigger the TAM, the better. For example, we feel the TAM (People with kids that use Android tablets) for Family Friendly Browsing is large enough to justify doing the work to ship the feature.

Net Promoter Score (NPS)

We have seen this come up in some surveys and user research. It’s supposed to show how satisfied your customers are with your product. This metric has its detractors though. Many people consider it a poor indicator, but it’s still used quite a lot.

NPS is the percentage of promoters minus the percentage of detractors, so it can be as low as -100 (everybody is a detractor) or as high as +100 (everybody is a promoter). A positive NPS (higher than zero) is felt to be good, and an NPS of +50 is excellent.

Go Forth!

If you don’t track any of these metrics for your applications, you should. There are a lot of off-the-shelf tools to help get you started. Level-up your engineering game and make a bigger impact on the success of your application at the same time.

Fun With Telemetry: URL Suggestions

Firefox for Android has a UI Telemetry system. Here is an example of one of the ways we use it.

As you type a URL into Firefox for Android, matches from your browsing history are shown. We also display search suggestions from the default search provider, and we recently added support for displaying matches from your previously entered search history. If any of these are tapped, with one exception, the term is used to load a search results page via the default search provider. If the term looks like a domain or URL, Firefox skips the search results page and loads the URL directly.

[Screenshot: fennec-suggestions-annotated]

  1. This suggestion is not really a suggestion. It’s what you have typed. Tagged as user.
  2. This is a suggestion from the search engine. There can be several search suggestions returned and displayed. Tagged as engine.#
  3. This is a special search engine suggestion. It matches a domain, and if tapped, Firefox loads the URL directly. No search results page. Tagged as url
  4. This is a matching search term from your search history. There can be several search history suggestions returned and displayed. Tagged as history.#
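These tagged events roll up into simple per-tag counts for the dashboard. A minimal sketch of that aggregation; the event field names are illustrative:

from collections import Counter

def suggestion_taps_by_tag(events):
    """Count URL-bar suggestion taps per tag: user, engine.#, url, history.#."""
    return Counter(e["tag"] for e in events
                   if e.get("action") == "loadurl" and "tag" in e)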

Since we only recently added the support for search history, we want to look at how it’s being used. Below is a filtered view of the URL suggestion section of our UI Telemetry dashboard. It looks like history.# is starting to get some usage, and it’s following a similar trend to engine.#, where the first suggestion returned is used more than the subsequent items.

Also worth pointing out is that we get a non-trivial amount of url situations. This should be expected: most search keyword data released by Google shows that navigational keywords are the most heavily used keywords.

An interesting observation is how often people use the user suggestion. Remember, this is not actually a suggestion; it’s what the person has already typed. Pressing “Enter” or “Go” would result in the same outcome. One theory for the high usage of that suggestion is that it provides a clear outcome: Firefox will search for this term. Other ways of triggering the search might be more ambiguous.

[Chart: telemetry-suggestions]

Firefox for Android: Collecting and Using Telemetry

Firefox 31 for Android is the first release where we collect telemetry data on user interactions. We created a simple “event” and “session” system, built on top of the current telemetry system that has been shipping in Firefox for many releases. The existing telemetry system is focused more on the platform features and tracking how various components are behaving in the wild. The new system is really focused on how people are interacting with the application itself.

Collecting Data

The basic system consists of two types of telemetry probes:

  • Events: A telemetry probe triggered when the user takes an action. Examples include tapping a menu, loading a URL, sharing content or saving content for later. An Event is also tagged with a Method (how was the Event triggered) and an optional Extra tag (extra context for the Event).
  • Sessions: A telemetry probe triggered when the application starts a short-lived scope or situation. Examples include showing a Home panel, opening the awesomebar or starting a reading viewer. Each Event is stamped with zero or more Sessions that were active when the Event was triggered.

We add the probes into any part of the application that we want to study, which is most of the application.
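A toy model of the probe system, in Python for brevity (the real implementation is Java inside Firefox for Android):

class UITelemetry:
    """Toy model: each Event is stamped with the Sessions active at the time."""
    def __init__(self):
        self.active_sessions = set()
        self.log = []

    def start_session(self, name):
        self.active_sessions.add(name)

    def stop_session(self, name):
        self.active_sessions.discard(name)

    def send_event(self, action, method, extras=None):
        self.log.append({"action": action, "method": method,
                         "extras": extras,
                         "sessions": sorted(self.active_sessions)})

t = UITelemetry()
t.start_session("awesomebar.1")
t.send_event("loadurl", "listitem")  # stamped with the awesomebar session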

Visualizing Data

The raw telemetry data is processed into summaries, one for Events and one for Sessions. In order to visualize the telemetry data, we created a simple dashboard (source code). It’s built using a great little library called PivotTable.js, which makes it easy to slice and dice the summary data. The dashboard has several predefined tables so you can start digging into various aspects of the data quickly. You can drag and drop the fields into the column or row headers to reorganize the table. You can also add filters to any of the fields, even those not used in the row/column headers. It’s a pretty slick library.

[Screenshot: uitelemetry-screenshot-crop]

Acting on Data

Now that we are collecting and studying the data, the goal is to find patterns that are unexpected or might warrant a closer inspection. Here are a few of the discoveries:

Page Reload: Even in our Nightly channel, people seem to be reloading the page quite a bit. Way more than we expected. It’s one of the Top 2 actions. Our current thinking includes several possibilities:

  1. Page gets stuck during a load and a Reload gets it going again
  2. Networking error of some kind, with a “Try again” button on the page. If the button does not solve the problem, a Reload might be attempted.
  3. Weather or some other update-able page where a Reload shows the current information.

We have started projects to explore the first two issues. The third issue might be fine as-is, or maybe we could add a feature to make updating pages easier? You can still see high uses of Reload (reload) on the dashboard.

Remove from Home Pages: The History page, primarily, and the Top Sites page see high uses of Remove (home_remove) to delete browsing information from the Home pages. People do this a lot; again, it’s one of the Top 2 actions. People will also do this repeatedly, clearing the entire list manually, item by item. Firefox has a Clear History feature, but it must not be very discoverable. We see people asking for easier ways of clearing history in our feedback too, but it wasn’t until we saw the telemetry data that we understood how badly this was needed. This led us to add some features:

  1. Since the History page was the predominant source of the Removes, we added a Clear History button right on the page itself.
  2. We added a way to Clear History when quitting the application. This was a bit tricky since Android doesn’t really promote “Quitting” applications, but if a person wants to enable this feature, we add a Quit menu item to make the action explicit and in their control.
  3. With so many people wanting to clear their browsing history, we assumed they didn’t know that Private Browsing existed. No history is saved when using Private Browsing, so we’re adding some contextual hinting about the feature.

These features are included in Nightly and Aurora versions of Firefox. Telemetry is showing a marked decrease in Remove usage, which is great. We hope to see the trend continue into Beta next week.

External URLs: People open a lot of URLs from external applications, like Twitter, into Firefox. This wasn’t totally unexpected, since it’s a common pattern on Android, but the degree to which it happens versus opening the browser directly was somewhat unexpected. Close to 50% of the URLs loaded into Firefox come from external applications. The share is lower in Nightly, Aurora and Beta, but even those channels are almost 30%. We have started looking into ideas for making the process of opening URLs into Firefox a better experience.

Saving Images: An unexpected discovery was how often people save images from web content (web_save_image). We haven’t spent much time considering this one. We think we are doing the “right thing” with the images as far as Android conventions are concerned, but there might be new features waiting to be implemented here as well.

Take a look at the data. What patterns do you see?

Here is the obligatory UI heatmap, also available from the dashboard:
[Image: uitelemetry-heatmap]

Zippity Update – Telemetry

I updated Zippity, our crowdsourced data collection system, to add support for memory and CPU telemetry. What is memory and CPU telemetry, you ask? Basically, we send data about the current Firefox Mobile memory and CPU usage patterns to Zippity. There are two ways to send the data:

  • Manually push a button: This might be a handy way to report metrics when you think Firefox Mobile is running slow on your phone.
  • Silently on idle once a day: During an idle moment, Zippity scans the metrics. This gives us data on the Firefox Mobile resting state. Hopefully we don’t see large CPU usage. Just install the Zippity add-on, and you start sending data – no additional work for you!

Why send this data, you ask? We care a lot about the performance of Firefox Mobile, and we want to better understand how Firefox Mobile is running on your phone. We run a lot of tests at Mozilla, but these tests sometimes don’t compare well to the real world. For example, I have no performance issues running Firefox Mobile on my Nexus One, but we get lots of feedback from users that the performance is too slow on their phones. We need to run tests and collect data from Firefox Mobile running on your phone.

What kind of data is sent, you ask? Firefox Mobile has a built-in memory reporting system, and Zippity enumerates that system to get detailed memory usage data. We also use the Linux /proc filesystem to get information about CPU usage. We only check the Firefox main and child processes; we don’t log any information about other processes. Thanks to Brad Lassey and Taras Glek for helping me get the data collected. Here is an example of the JSON-formatted data sent to Zippity:

{
    "product": "Fennec 6.0a1",
    "os": "Android",
    "buildid": "20110416004121",
    "type": "metrics",
    "date": 1303152081000,
    "testkey": "telemetry-v1",
    "appid": "{a23983c0-fd0e-11dc-95ff-0800200c9a66}",
    "device": "Nexus One",
    "userkey": "",
    "osversion": "REL (10)",
    "data": {
        "reason": "manual",
        "malloc/allocated": 32377108,
        "malloc/mapped": 34603008,
        "malloc/committed": 34603008,
        "malloc/dirty": 77824,
        "js/gc-heap": 5242880,
        "js/string-data": 514630,
        "js/mjit-code": 88500,
        "storage/sqlite/pagecache": 4536536,
        "storage/sqlite/other": 1050816,
        "storage/permissions.sqlite/LookAside_Used": 29,
        "storage/permissions.sqlite/Cache_Used": 99208,
        "storage/permissions.sqlite/Schema_Used": 1272,
        "storage/permissions.sqlite/Stmt_Used": 5744,
        "storage/extensions.sqlite/LookAside_Used": 438,
        "storage/extensions.sqlite/Cache_Used": 428248,
        "storage/extensions.sqlite/Schema_Used": 6928,
        "storage/extensions.sqlite/Stmt_Used": 119024,
        "gfx/surface/image": 1088248,
        "images/chrome/used/raw": 0,
        "images/chrome/used/uncompressed": 1025124,
        "images/chrome/unused/raw": 0,
        "images/chrome/unused/uncompressed": 3648,
        "images/content/used/raw": 0,
        "images/content/used/uncompressed": 2144,
        "images/content/unused/raw": 4168,
        "images/content/unused/uncompressed": 8288,
        "layout/all": 563338,
        "storage/places.sqlite/LookAside_Used": 143,
        "storage/places.sqlite/Cache_Used": 428488,
        "storage/places.sqlite/Schema_Used": 11704,
        "storage/places.sqlite/Stmt_Used": 44152,
        "content/canvas/2d_pixel_bytes": 81120,
        "storage/cookies.sqlite/LookAside_Used": 14,
        "storage/cookies.sqlite/Cache_Used": 165000,
        "storage/cookies.sqlite/Schema_Used": 1816,
        "storage/cookies.sqlite/Stmt_Used": 0,
        "storage/formhistory.sqlite/LookAside_Used": 14,
        "storage/formhistory.sqlite/Cache_Used": 197920,
        "storage/formhistory.sqlite/Schema_Used": 1656,
        "storage/formhistory.sqlite/Stmt_Used": 0,
        "storage/addons.sqlite/LookAside_Used": 152,
        "storage/addons.sqlite/Cache_Used": 296616,
        "storage/addons.sqlite/Schema_Used": 4280,
        "storage/addons.sqlite/Stmt_Used": 29152,
        "storage/search.sqlite/LookAside_Used": 20,
        "storage/search.sqlite/Cache_Used": 99192,
        "storage/search.sqlite/Schema_Used": 1216,
        "storage/search.sqlite/Stmt_Used": 1840,
        "shmem/allocated": 0,
        "shmem/mapped": 0,
        "storage/index.sqlite/LookAside_Used": 148,
        "storage/index.sqlite/Cache_Used": 296656,
        "storage/index.sqlite/Schema_Used": 3032,
        "storage/index.sqlite/Stmt_Used": 51704,
        "uptime": 159848,
        "process": {
            "parent": {
                "user": 13,
                "system": 2,
                "rss": 77456
            },
            "child": {
                "user": 22,
                "system": 2,
                "rss": 25132
            },
            "cpu": 57
        }
    }
}
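The process block at the end of the payload comes from /proc. Reading those numbers looks roughly like this sketch; the field offsets follow proc(5), and the values are in clock ticks and pages:

def proc_stats(pid):
    """Read user/system CPU ticks and RSS pages from /proc/<pid>/stat."""
    with open("/proc/%d/stat" % pid) as f:
        data = f.read()
    # comm (field 2) may contain spaces; split after the closing paren.
    fields = data[data.rindex(")") + 2:].split()
    # Per proc(5): utime is field 14, stime 15, rss 24 (1-based), so
    # after dropping pid and comm they sit at offsets 11, 12 and 21.
    return {"user": int(fields[11]),
            "system": int(fields[12]),
            "rss_pages": int(fields[21])}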

You can now access Metrics charts on Zippity to view the data. Finding good ways to visualize the data isn’t easy; there is a lot of it. The charts are simple for now, and the data point tooltips show more detailed information.

Visit Zippity from your phone and install the add-on today. Thanks for using Zippity to help us collect data to make Firefox Mobile better.

Firefox 3 – Offline App Demo

Firefox 3’s offline capabilities have been getting some attention lately: Chris Double’s post on porting Zimbra to work offline shows off the potential. Robert O’Callahan and John Resig give details on the different pieces of the capabilities, namely: Offline Cache, Offline Events and DOMStorage.

A Simple Demo – Task Helper

I put together a simple offline application sample to help illustrate how the features can work. Call it – explanation through code. The application is a simple task list system. The current functionality includes:

  • Add tasks – Enter text in field and press ‘Add’
  • Complete tasks – Mark completed tasks and press ‘Complete’
  • Remove tasks – Mark tasks and press ‘Remove’
  • Data is stored as JSON and XHR is used to interact with server (PHP)

[Screenshot: taskhelper.png]

Nothing fancy. However, the application is “offline-aware”, meaning:

  • Application resources (HTML & JS) are tagged as offline cacheable
  • Before interacting with the server, online status is checked. If online, use XHR. If offline, use DOMStorage
  • Online/Offline status is monitored using events. When the application switches from offline to online, data is resynced with the server

There is really nothing fancy about the offline stuff either, but putting it all together does make for a neat application. Using the latest Firefox 3 alpha, you should be able to:

  • Use the application while online.
  • Go offline using the “Work Offline” menu. The event log should show that the app went offline
  • Continue to use the application while offline
  • Go online using the “Work Offline” menu. The event log should show that the app is back online and has updated the server with any offline changes

If you were using a version of Firefox 3 with the offline cache patch (bug 367447), you would be able to do a little more:

  • Use the application while online.
  • Go offline using the “Work Offline” menu. The event log should show that the app went offline
  • Continue to use the application while offline
  • Restart browser
  • Switch to offline (or have no network connection)
  • Enter URL to app (or use a bookmark)
  • Start using app with data from last session
  • Switching to online will update the server with offline data

Currently, the offline events only fire when offline mode is toggled via the “Work Offline” menu. Getting the events to fire by watching the network connection is the goal, and it would make this much more unobtrusive for the user.

Here is the source to the task helper application: taskhelper-src.zip