Bug 1277810 - Part 1 - Restructure Telemetry client documentation. r=dexter, f=chutten

Georg Fritzsche
2016-08-09 13:52:27 +02:00
parent 330852d2ea
commit fa5a8465ff
22 changed files with 92 additions and 42 deletions


@@ -0,0 +1,23 @@
=======
Crashes
=======
Firefox can crash in many different ways, and no single system records all of them.
Main process crashes
====================
If the Firefox main process dies, that should be recorded as an aborted session, and a :doc:`main ping <../data/main-ping>` is submitted with the reason ``aborted-session``.
If we have a crash dump for that crash, we should also submit a :doc:`crash ping <../data/crash-ping>`.
The ``aborted-session`` information is first written to disk 60 seconds after startup, so any earlier crash will not trigger an ``aborted-session`` ping.
The information is then updated at least every 5 minutes, so it may lag behind the last session state.
Crashes during startup should be recorded in the next session's main ping in the ``STARTUP_CRASH_DETECTED`` histogram.
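The timing rules above can be sketched as a small model. This is only an illustration, not Telemetry's implementation: the function name is hypothetical, and it assumes the worst case in which the data is rewritten exactly every 5 minutes after the first save.

```python
FIRST_SAVE_DELAY_S = 60         # first write, 60 seconds after startup
MAX_UPDATE_INTERVAL_S = 5 * 60  # assumed worst case: one update every 5 minutes


def last_aborted_session_save(crash_time_s):
    """Return when (seconds since startup) the aborted-session data was
    last written before a crash at crash_time_s, or None if the crash
    happened before the first write (hypothetical helper)."""
    if crash_time_s < FIRST_SAVE_DELAY_S:
        return None  # no data on disk, so no aborted-session ping
    elapsed = crash_time_s - FIRST_SAVE_DELAY_S
    completed_updates = elapsed // MAX_UPDATE_INTERVAL_S
    return FIRST_SAVE_DELAY_S + completed_updates * MAX_UPDATE_INTERVAL_S
```

Under these assumptions, a crash 30 seconds after startup leaves nothing to send, while a crash at 700 seconds is reported with data that may be up to 40 seconds old.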
Child process crashes
=====================
If a Firefox plugin, content or gmplugin process dies unexpectedly, this is recorded in the main ping's ``SUBPROCESS_ABNORMAL_ABORT`` keyed histogram.
If we also capture a crash report for it, the ``SUBPROCESS_CRASHES_WITH_DUMP`` keyed histogram is incremented as well.
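As a rough sketch of how the two keyed histograms relate, the following models them as nested counters. The ``record_child_crash`` helper is hypothetical, and keying by process type is an assumption made for illustration; the actual recording happens inside Firefox.

```python
from collections import Counter


def record_child_crash(histograms, process_type, has_dump):
    """Count a child process crash in a dict of keyed histograms.

    Assumption: the histogram key is the process type
    ("plugin", "content" or "gmplugin").
    """
    histograms.setdefault("SUBPROCESS_ABNORMAL_ABORT", Counter())[process_type] += 1
    if has_dump:
        # Only incremented when a crash report was captured as well.
        histograms.setdefault("SUBPROCESS_CRASHES_WITH_DUMP", Counter())[process_type] += 1


histograms = {}
record_child_crash(histograms, "content", has_dump=True)
record_child_crash(histograms, "content", has_dump=False)
record_child_crash(histograms, "plugin", has_dump=True)
```

In this model the count in ``SUBPROCESS_CRASHES_WITH_DUMP`` for a key can never exceed the count in ``SUBPROCESS_ABNORMAL_ABORT`` for the same key.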


@@ -0,0 +1,11 @@
========
Concepts
========
.. toctree::
   :maxdepth: 2
   :titlesonly:
   :glob:

   pings
   crashes


@@ -0,0 +1,71 @@
.. _telemetry_pings:
=====================
Telemetry pings
=====================
A *Telemetry ping* is the data that we send to Mozilla's Telemetry servers.
That data is stored as a JSON object client-side and contains information common to all pings and a payload specific to a certain *ping type*.
The top-level structure is defined by the :doc:`../data/common-ping` format.
It contains some basic information shared between different ping types, the :doc:`../data/environment` data (optional) and the data specific to the *ping type*, the *payload*.
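To make the layering concrete, here is a minimal sketch of such a JSON object. The ``build_ping`` helper and the exact field names (``type``, ``id``, ``creationDate``, ``environment``, ``payload``) are assumptions for illustration; the authoritative layout is the common ping format referenced above.

```python
import json
import uuid
from datetime import datetime, timezone


def build_ping(ping_type, payload, environment=None):
    """Assemble a ping-like dict (hypothetical helper, assumed field names)."""
    ping = {
        "type": ping_type,                                       # the ping type
        "id": str(uuid.uuid4()),                                 # the ping identifier
        "creationDate": datetime.now(timezone.utc).isoformat(),  # the creation date
        "payload": payload,                                      # data specific to the ping type
    }
    if environment is not None:
        ping["environment"] = environment                        # optional environment data
    return ping


ping = build_ping("main", {"histograms": {}})
serialized = json.dumps(ping)
```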
Submission
==========
*Note:* The server-side behaviour is documented in the `HTTP Edge Server specification <https://wiki.mozilla.org/CloudServices/DataPipeline/HTTPEdgeServerSpecification>`_.
Pings are submitted via a common API on ``TelemetryController``.
If a ping cannot be submitted to the server immediately (e.g. because there
is no internet connection), Telemetry stores it on disk and retries sending
it until the maximum ping age (14 days) is exceeded.
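The age limit can be expressed as a simple check. This is a sketch only: ``is_ping_expired`` is a hypothetical name, and measuring the age from the ping's creation time is an assumption.

```python
from datetime import datetime, timedelta, timezone

MAX_PING_AGE = timedelta(days=14)


def is_ping_expired(created, now):
    """Return True once a pending ping has exceeded the maximum ping age."""
    return now - created > MAX_PING_AGE


now = datetime(2016, 8, 9, tzinfo=timezone.utc)
fresh = now - timedelta(days=13)
stale = now - timedelta(days=15)
```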
*Note:* the :doc:`main pings <../data/main-ping>` are kept locally even after successful submission to enable the HealthReport and SelfSupport features. They will be deleted after their retention period of 180 days.
Sending of pending pings starts as soon as the delayed startup is finished. They are sent in batches, newest-first, with up
to 10 persisted pings per batch plus all unpersisted pings.
The send logic then waits for each batch to complete.
If it succeeds we trigger the next send of a ping batch. This is delayed as needed to only trigger one batch send per minute.
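The batch selection described above might look roughly like the following. The ``next_batch`` helper and the ``created`` field are hypothetical, and the sketch leaves out the one-batch-per-minute throttling.

```python
MAX_PERSISTED_PER_BATCH = 10


def next_batch(unpersisted, persisted):
    """Pick the pings for one batch: all unpersisted pings plus up to
    10 persisted pings, newest first (pings are dicts with a 'created'
    timestamp in this sketch)."""
    newest_first = sorted(persisted, key=lambda p: p["created"], reverse=True)
    return list(unpersisted) + newest_first[:MAX_PERSISTED_PER_BATCH]


persisted = [{"id": f"p{i}", "created": i} for i in range(12)]
unpersisted = [{"id": "u0", "created": 100}, {"id": "u1", "created": 101}]
batch = next_batch(unpersisted, persisted)
```

With 12 persisted pings, the two oldest ones are left for a later batch.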
If ping sending encounters an error that requires retrying later, a backoff is
triggered: the timeout for the next try increases exponentially from 1 minute up to a limit of 120 minutes.
Any new ping submission or ``idle-daily`` event resets this backoff as a safety mechanism and triggers immediate ping sending.
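A minimal model of this backoff, assuming the timeout doubles on each failure (the text only says it grows exponentially; the class and method names are hypothetical):

```python
INITIAL_TIMEOUT_MIN = 1
MAX_TIMEOUT_MIN = 120


class SendBackoff:
    """Tracks the delay before the next send attempt, in minutes."""

    def __init__(self):
        self.timeout_min = INITIAL_TIMEOUT_MIN

    def on_retryable_failure(self):
        """Return the delay to wait now, then grow it for the next failure."""
        delay = self.timeout_min
        # Assumption: doubling, capped at the 120 minute limit.
        self.timeout_min = min(self.timeout_min * 2, MAX_TIMEOUT_MIN)
        return delay

    def reset(self):
        """Called on a new ping submission or an idle-daily event."""
        self.timeout_min = INITIAL_TIMEOUT_MIN


backoff = SendBackoff()
delays = [backoff.on_retryable_failure() for _ in range(9)]
```

Under the doubling assumption the delays are 1, 2, 4, 8, 16, 32 and 64 minutes, after which every further failure waits the full 120 minutes until a reset.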
The telemetry server team is working towards `the common services status codes <https://wiki.mozilla.org/CloudServices/DataPipeline/HTTPEdgeServerSpecification#Server_Responses>`_, but for now the following logic is sufficient for Telemetry:
* ``2XX`` - success, don't resubmit
* ``4XX`` - there was some problem with the request - the client should not try to resubmit as it would just receive the same response
* ``5XX`` - there was a server-side error, the client should try to resubmit later
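The three rules above translate into a small decision function. This is a sketch with a hypothetical name; the list does not say what happens for other status classes, so the sketch raises an error for them instead of guessing.

```python
def should_resubmit(status):
    """Decide whether a ping should be sent again for an HTTP status code."""
    if 200 <= status < 300:
        return False  # success, don't resubmit
    if 400 <= status < 500:
        return False  # the request itself is the problem; a retry gets the same response
    if 500 <= status < 600:
        return True   # server-side error, try again later
    # Not covered by the documented rules (assumption: treat as unsupported).
    raise ValueError(f"no documented handling for status {status}")
```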
Ping types
==========
We send Telemetry with different ping types. The :doc:`main <../data/main-ping>` ping is the ping that contains the bulk of the Telemetry measurements for Firefox. For more specific use-cases, we also send other custom ping types.
Examples are:
* :doc:`main <../data/main-ping>` - contains the information collected by Telemetry (Histograms, hang stacks, ...)
* :doc:`saved-session <../data/main-ping>` - has the same format as a main ping, but it contains the *"classic"* Telemetry payload with measurements covering the whole browser session. This is only a separate type to make storage of saved-session easier server-side. This is temporary and will be removed soon.
* :doc:`crash <../data/crash-ping>` - a ping that is captured and sent after Firefox crashes.
* ``activation`` - *planned* - sent right after installation or profile creation
* ``upgrade`` - *planned* - sent right after an upgrade
* :doc:`deletion <../data/deletion-ping>` - sent when FHR upload is disabled, requesting deletion of the data associated with this user
* :doc:`uitour <../data/uitour-ping>` - a ping submitted via the UITour API
* :doc:`heartbeat <../data/heartbeat-ping>` - contains information on Heartbeat surveys
* :doc:`sync <../data/sync-ping>` - sent after a sync is completed or fails, contains information on sync errors and performance.
Pings sent from code that ships with Firefox are listed in the :doc:`data documentation <../data/index>`.
Archiving
=========
When archiving is enabled through the relevant preference, pings submitted to ``TelemetryController`` are also stored locally in the user profile directory, in ``<profile-dir>/datareporting/archived``.
To allow for cheaper lookup of archived pings, storage follows a specific naming scheme for both the directory and the ping file name: ``<YYYY-MM>/<timestamp>.<UUID>.<type>.json``.
* ``<YYYY-MM>`` - The subdirectory name, generated from the ping creation date.
* ``<timestamp>`` - Timestamp of the ping creation date.
* ``<UUID>`` - The ping identifier.
* ``<type>`` - The ping type.
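The naming scheme can be illustrated with a short helper. It is a sketch under assumptions: ``archived_ping_path`` is a hypothetical name, the timestamp is taken to be milliseconds since the Unix epoch, and the subdirectory is derived from the creation date as given.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def archived_ping_path(created, ping_id, ping_type):
    """Build the path of an archived ping relative to the archive directory."""
    subdir = created.strftime("%Y-%m")              # <YYYY-MM>
    timestamp_ms = int(created.timestamp() * 1000)  # <timestamp>, assumed to be in ms
    return PurePosixPath(subdir) / f"{timestamp_ms}.{ping_id}.{ping_type}.json"


created = datetime(2016, 8, 9, 11, 52, 27, tzinfo=timezone.utc)
path = archived_ping_path(created, "2f4f8b6e-0000-4000-8000-000000000000", "main")
```

Because the month is in the directory name and the type is in the file name, pings for a given month or type can be located from directory listings alone, without opening any files.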