If the user's system does not actually support WebGL 2, attempting to create a DrawTargetWebgl will always fail.
However, each time we go to create a DrawTargetWebgl, we try to initialize a WebGL 2 context again.
Each time this happens, if the user is using GLX, it calls XCreatePixmap and glXCreateContextAttribs,
which can run afoul of a libX11 race condition. This happens on every page load where a new canvas is
encountered.
It would be better to note that the failure is non-recoverable and not try to use DrawTargetWebgl on
subsequent attempts. Here we set the sContextInitError value to indicate there are non-recoverable
errors, after which creation is bypassed on subsequent attempts.
This means the worst exposure the user will have to the libX11 race condition is once per process lifetime,
which should greatly reduce the incidence.
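A minimal sketch of the sticky-failure idea, for illustration only. Apart from the name sContextInitError, everything here is an assumption: the error's type, TryCreateWebgl2Context, gContextAttempts and CreateAcceleratedTarget are hypothetical stand-ins, not the code in the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the real context-creation call. It counts how
// often the expensive (and, on GLX, race-prone) path is actually entered.
static int gContextAttempts = 0;
static bool TryCreateWebgl2Context() {
  ++gContextAttempts;
  return false;  // Simulate a system without WebGL 2 support.
}

// Process-wide record of a non-recoverable initialization failure.
// An empty string means no such failure has been seen yet (assumed type).
static std::string sContextInitError;

// Returns true if an accelerated target was created.
bool CreateAcceleratedTarget() {
  if (!sContextInitError.empty()) {
    return false;  // A previous attempt failed for good; bypass creation.
  }
  if (!TryCreateWebgl2Context()) {
    sContextInitError = "WebGL 2 not supported";
    return false;
  }
  return true;
}
```

However many times CreateAcceleratedTarget is called in this sketch, the simulated context creation runs only once per process.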
Differential Revision: https://phabricator.services.mozilla.com/D171016
If we choose to accelerate a single line path, we need to take care not to use
the line cap when the path is closed. When the path is closed, we need to use
the line join instead.
Differential Revision: https://phabricator.services.mozilla.com/D170469
Rectangles with large floating-point coordinates can start to lose mantissa bits when transformed directly
by the vertex shader, pushed through the viewport transform, and then clipped later. OTOH, the path
fallback (WGR/AA-Stroke or Skia) does not stress floating-point precision as much, so it generally
returns more correct results in these cases. This becomes noticeable when the same rectangle is
filled and then subsequently stroked, since the fill may use the rect shader, while the stroke
will fall back to using a path.
To avoid hitting this issue, this checks if a rect's coordinates are outside a certain reasonable
limit, and if so, falls back to relying on path geometry to handle transform and clipping safely.
If within the limit, the shader precision loss doesn't noticeably impact the results so it is still
safe to use the fast-path.
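The check described above might look roughly like the following sketch. The limit value kMaxRectCoord and all type and function names are hypothetical. The message only says "a certain reasonable limit", so the number here is a placeholder.

```cpp
#include <cassert>
#include <cmath>

struct Rect {
  float x, y, width, height;
};

// Placeholder limit; the actual value used by the patch is not stated.
constexpr float kMaxRectCoord = 1.0e6f;

// True if every edge of the rect lies within [-kMaxRectCoord, kMaxRectCoord].
// NaN coordinates fail every comparison and are therefore rejected too.
bool RectInsidePrecisionLimits(const Rect& r) {
  const float x1 = r.x + r.width;
  const float y1 = r.y + r.height;
  return std::fabs(r.x) <= kMaxRectCoord && std::fabs(r.y) <= kMaxRectCoord &&
         std::fabs(x1) <= kMaxRectCoord && std::fabs(y1) <= kMaxRectCoord;
}

enum class FillStrategy { RectShader, PathFallback };

// Use the fast rect shader only when precision loss should be negligible.
FillStrategy ChooseFillStrategy(const Rect& r) {
  return RectInsidePrecisionLimits(r) ? FillStrategy::RectShader
                                      : FillStrategy::PathFallback;
}
```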
Differential Revision: https://phabricator.services.mozilla.com/D170270
Skia upstream removed deprecated clip ops that could be used to replace
the clipping stack and bypass clips. We shouldn't really need to do this
anymore, as we can work around it just using public APIs.
The only SkCanvas operation that allows us to bypass clipping is
writePixels, which still allows us to implement CopySurface/putImageData.
Other instances where we were using the replace op for DrawTargetWebgl
layering support can just be worked around by creating a separate
DrawTargetSkia pointing to the same pixel data, but on which no clipping
or transforms are applied so that we can freely do drawing operations
on it to the base layer pixel data regardless of any user-applied clipping.
Differential Revision: https://phabricator.services.mozilla.com/D168039
This updates the version of wpf-gpu-raster, which adds support for
GPUs/drivers that use truncation instead of rounding when converting
vertices to fixed point.
It also adds the GL vendor to InitContextResult so that we can detect
AMD on macOS and tell wpf-gpu-raster that truncation is going to happen.
Differential Revision: https://phabricator.services.mozilla.com/D167503
This adds a debug indicator controlled by the pref gfx.canvas.accelerated.debug.
A green square is drawn in the upper right corner of the canvas to let us know if
acceleration is being used or not.
Differential Revision: https://phabricator.services.mozilla.com/D165018
CanvasRenderingContext2D relies upon CreateSimilarDrawTarget to extract
a subrect from a surface to draw. However, DrawTargetWebgl does not return an
accelerated DT for that API as creating an entirely new context can be quite
expensive.
To work around this, this adds a specific ExtractSubrect API for SourceSurface
that can bypass the entire need to create a temporary DrawTarget to copy into.
Differential Revision: https://phabricator.services.mozilla.com/D164118
This pre-allocates a vertex output buffer in DrawTargetWebgl so that we can generate
wpf-gpu-raster and aa-stroke output into it. This way, they don't have to reallocate
a Vec on pushes or when converting into a boxed slice. This can net 5-10% on profiles for
the demos noted in the bug.
Depends on D163989
Differential Revision: https://phabricator.services.mozilla.com/D163990
This increases the amount of quantization applied to path cache entries
to 0.25 increments (2 bits of subpixel precision), leading to 4x4=16
subpixel buckets in total along both axes.
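To make the arithmetic concrete, here is a small sketch of quantizing to 0.25 increments. Rounding down with floor is an assumption (the message does not say how the real code rounds), and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

constexpr int kSubpixelBits = 2;                    // 2 bits of subpixel precision
constexpr int kSubpixelSteps = 1 << kSubpixelBits;  // 4 steps per pixel per axis

// Snap a coordinate down to the nearest 0.25 increment.
float QuantizeOffset(float v) {
  return std::floor(v * kSubpixelSteps) / kSubpixelSteps;
}

// Index in [0, 15] of the subpixel bucket that (x, y) falls into.
int SubpixelBucket(float x, float y) {
  const float fx = x - std::floor(x);  // fractional part in [0, 1]
  const float fy = y - std::floor(y);
  // Clamp in case rounding makes a fractional part come out as exactly 1.
  const int bx = std::min(static_cast<int>(fx * kSubpixelSteps), kSubpixelSteps - 1);
  const int by = std::min(static_cast<int>(fy * kSubpixelSteps), kSubpixelSteps - 1);
  return by * kSubpixelSteps + bx;
}
```

With 4 steps on each axis, bx and by each take 4 values, which gives the 4x4=16 buckets mentioned above.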
Differential Revision: https://phabricator.services.mozilla.com/D163749
This fixes an incidental bug with the pref to turn off GPU stroking. The pref is not supposed to disable
caching strokes as textures. This allows stroke caching to work again even when GPU stroking is preffed off.
Differential Revision: https://phabricator.services.mozilla.com/D163307
It seems like GPU stroking is slow for now, until we implement a better way than wpf-gpu-raster
for stroking paths. Just hide it behind a pref so we can at least test it without
impacting performance as badly.
Differential Revision: https://phabricator.services.mozilla.com/D163248
For use-cases that repeatedly pop and re-push the same clips over and over, we can end up regenerating the
same mask that is still stored, because we only detect that the clip state changed, rather than
that it changed back to exactly the same state it was in previously.
This just remembers the previous state of the clip stack at the time the clip mask was generated
so that we can compare the previous and current state. If they're the same, we can assume there
is no need to regenerate the clip mask again and simply reuse it.
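The comparison could be sketched as follows. ClipRect, ClipMaskCache and EnsureMask are hypothetical names, and the real clip stack holds more than plain rectangles. The sketch only shows the "remember the stack the mask was built from" idea.

```cpp
#include <cassert>
#include <vector>

// Simplified clip entry (assumption: real clips also carry paths, transforms).
struct ClipRect {
  float x, y, w, h;
  bool operator==(const ClipRect& o) const {
    return x == o.x && y == o.y && w == o.w && h == o.h;
  }
};

class ClipMaskCache {
 public:
  // Returns true if the mask had to be regenerated for this clip stack.
  bool EnsureMask(const std::vector<ClipRect>& currentStack) {
    if (mHasMask && currentStack == mMaskStack) {
      return false;  // Same state as when the mask was generated; reuse it.
    }
    mMaskStack = currentStack;  // Remember the state the mask corresponds to.
    mHasMask = true;
    ++mRegenerations;  // Stand-in for the expensive mask rendering and upload.
    return true;
  }

  int Regenerations() const { return mRegenerations; }

 private:
  std::vector<ClipRect> mMaskStack;
  bool mHasMask = false;
  int mRegenerations = 0;
};
```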
Differential Revision: https://phabricator.services.mozilla.com/D162699
WebGL doesn't reliably implement line smoothing, so we can't rely on it, making it
useless for canvas lines. Instead, just fall back to emulating it manually with paths.
Differential Revision: https://phabricator.services.mozilla.com/D162540
Some paths may be so complex that their vertex representation far exceeds their
software rasterized representation in memory size. As a sanity-check, we should just set
a hard limit on the maximum allowed complexity of a path that we attempt to supply to
wpf-gpu-raster. Beyond that, we will instead just rasterize in software and upload
to a texture which can be more performant.
Differential Revision: https://phabricator.services.mozilla.com/D162481
By default, BorrowSnapshot is pessimistic and forces DrawTargetWebgl to return a data snapshot on
the assumption that the snapshot might be used off thread. However, if we actually know the DrawTarget
we're going to be drawing the snapshot to, then we can check if they're both DrawTargetWebgls with
the same internal SharedContext. In that case, we can use a SourceSurfaceWebgl snapshot which can
pass through a GPU texture to the target. This requires us to plumb the DrawTarget down through
SurfaceFromElement all the way to DrawTargetWebgl to make this decision.
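The decision described above could be sketched like this. The types are simplified stand-ins (the real DrawTarget and SharedContext classes look different), and ChooseSnapshotKind is a hypothetical helper name.

```cpp
#include <cassert>
#include <memory>

struct SharedContext {};

enum class BackendType { Skia, Webgl };

// Simplified stand-in for a draw target; context is null for non-WebGL targets.
struct DrawTarget {
  BackendType type;
  std::shared_ptr<SharedContext> context;
};

enum class SnapshotKind { Data, GpuTexture };

// Be pessimistic unless the destination is known and shares our context.
SnapshotKind ChooseSnapshotKind(const DrawTarget& source, const DrawTarget* target) {
  if (target && source.type == BackendType::Webgl &&
      target->type == BackendType::Webgl && source.context &&
      source.context == target->context) {
    return SnapshotKind::GpuTexture;  // Texture can be passed through directly.
  }
  return SnapshotKind::Data;  // Might be used off thread; return a data snapshot.
}
```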
Differential Revision: https://phabricator.services.mozilla.com/D162176
This adds a path vertex buffer where triangle list output from WGR is stored.
Each PathCacheEntry can potentially reference a range of vertexes in this buffer
corresponding to triangles for that entry. When this buffer is full, it gets
orphaned and clears corresponding cache entries, so that it can start anew.
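A CPU-side sketch of the allocation scheme, under stated assumptions: the class layout, the names, and the use of raw back-pointers to entries are all hypothetical, and no actual GL buffer is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified cache entry: the range of vertices it owns in the shared buffer.
struct PathCacheEntry {
  bool hasVertices = false;
  std::size_t offset = 0;
  std::size_t count = 0;
};

class PathVertexBuffer {
 public:
  explicit PathVertexBuffer(std::size_t capacity) : mCapacity(capacity) {}

  // Reserve `count` vertices for `entry`. If the buffer would overflow, orphan
  // it first: invalidate all ranges handed out so far and restart at offset 0.
  // Entries must outlive the buffer, since it keeps raw pointers to them.
  bool Allocate(PathCacheEntry& entry, std::size_t count) {
    if (count > mCapacity) {
      return false;  // Can never fit; the caller has to fall back.
    }
    if (mUsed + count > mCapacity) {
      Orphan();
    }
    entry.hasVertices = true;
    entry.offset = mUsed;
    entry.count = count;
    mEntries.push_back(&entry);
    mUsed += count;
    return true;
  }

 private:
  void Orphan() {
    for (PathCacheEntry* e : mEntries) {
      e->hasVertices = false;  // Clear the corresponding cache entries.
    }
    mEntries.clear();
    mUsed = 0;
  }

  std::size_t mCapacity;
  std::size_t mUsed = 0;
  std::vector<PathCacheEntry*> mEntries;
};
```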
Differential Revision: https://phabricator.services.mozilla.com/D161479
[Int]CoordTyped no longer inherits Units, because otherwise
instances of [Int]PointTyped may get one Base subobject because
they inherit Units, and others because of BasePoint's Coord members,
which ends up increasing the size of those instances (since,
according to the ISO C++ standard, different Base subobjects of the
same type are required to have different addresses).
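The layout effect can be reproduced with a reduced example. The struct names below are hypothetical simplifications of the real typed-unit classes. Exact sizes are ABI-dependent, so the sketch only compares them.

```cpp
#include <cassert>
#include <cstddef>

struct Units {};  // Empty tag base, standing in for the units base class.

// Layout A: both the coordinate and the point inherit the empty base.
// The point's own Units subobject and the Units subobject inside its first
// member may not share an address, so the first member gets pushed back.
struct CoordWithBase : Units {
  float value;
};
struct PointWithDupBase : Units {
  CoordWithBase x, y;
};

// Layout B: only the point inherits the empty base, so the empty base
// optimization can place it at the same address as the first member.
struct CoordPlain {
  float value;
};
struct PointSingleBase : Units {
  CoordPlain x, y;
};

// True on ABIs that implement the layout rule as described above.
constexpr bool kDupBaseIsLarger =
    sizeof(PointWithDupBase) > sizeof(PointSingleBase);
```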
Differential Revision: https://phabricator.services.mozilla.com/D160713
If we have stroked paths whose bounds cover a lot of screen area, that can lead
to a lot of empty area in the interior that bloats the path cache textures up
with unused pixels that still need to be uploaded. Try to avoid this by not
trying to accelerate paths with the path cache that take up a large amount
of screen area.
Differential Revision: https://phabricator.services.mozilla.com/D160023
For canvas users that rapidly create and destroy canvases, we may end up creating
a new SharedContext (and hence ClientWebGLContext) if there are no more canvases
left between destruction and creation. To work around this, just keep alive the
SharedContext for the main thread (other threads are unfortunately a bit tricky
to support) so that canvas creation remains fast in this instance.
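The keep-alive can be illustrated with standard smart pointers. This is only a sketch under assumptions: the real code uses Mozilla's own reference-counting types, and AcquireSharedContext, sWeakContext, sMainThreadContext and gContextsCreated are hypothetical names.

```cpp
#include <cassert>
#include <memory>

static int gContextsCreated = 0;

// Stand-in for the expensive-to-create shared context.
struct SharedContext {
  SharedContext() { ++gContextsCreated; }
};

// Weak reference: the context only lives while some canvas still holds it.
static std::weak_ptr<SharedContext> sWeakContext;
// Strong reference that keeps the main-thread context alive between canvases.
static std::shared_ptr<SharedContext> sMainThreadContext;

std::shared_ptr<SharedContext> AcquireSharedContext(bool keepAlive) {
  std::shared_ptr<SharedContext> ctx = sWeakContext.lock();
  if (!ctx) {
    ctx = std::make_shared<SharedContext>();  // Expensive path.
    sWeakContext = ctx;
  }
  if (keepAlive) {
    sMainThreadContext = ctx;  // Survives even when no canvas is left.
  }
  return ctx;
}
```

Without the strong reference, each create/destroy cycle in this sketch builds a new context. With it, only the first acquisition does.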
Differential Revision: https://phabricator.services.mozilla.com/D158904
If we fail to compile DrawTargetWebgl's shaders, we bail out to a normal software canvas.
However, it will still try to create a DrawTargetWebgl every time we need to create a canvas.
To avoid this, remember if shader compilation failed in the process, and don't try to create
an accelerated canvas again in that case.
Differential Revision: https://phabricator.services.mozilla.com/D158903
For canvas users that rapidly create and destroy canvases, we may end up creating
a new SharedContext (and hence ClientWebGLContext) if there are no more canvases
left between destruction and creation. To work around this, just keep alive the
SharedContext for the main thread (other threads are unfortunately a bit tricky
to support) so that canvas creation remains fast in this instance.
Depends on D158903
Differential Revision: https://phabricator.services.mozilla.com/D158904
If we fail to compile DrawTargetWebgl's shaders, we bail out to a normal software canvas.
However, it will still try to create a DrawTargetWebgl every time we need to create a canvas.
To avoid this, remember if shader compilation failed in the process, and don't try to create
an accelerated canvas again in that case.
Differential Revision: https://phabricator.services.mozilla.com/D158903
Previously we were reusing the framebuffer's Skia DT to render the clip mask.
This was the path of least resistance since SkCanvas does not allow exporting
clip information, and there is no way to reset the bitmap storage inside an
SkCanvas temporarily.
However, this can cause a feedback cycle of unnecessary WaitForShmem operations,
since we need to wait before we can generate the clip mask into the Skia target,
and then anything else after it needs to wait for the clip mask to finish uploading
before the Skia DT can be used again.
To alleviate this, we just allocate a new DrawTargetSkia to render the clip mask
into. We carefully clip the size of the DT so that in the common case we avoid
having to upload a surface the size of the entire framebuffer. Further, since
this is a completely different DT, we can now use an A8 format (1/4 the memory
overhead) instead of a BGRA8 format for the clip mask, which gives a further
memory usage gain.
A further complication is that we need to log the current clip stack state so
that we can replay it onto the new DrawTargetSkia. This avoids having to add
a mechanism to SkCanvas to export clip information.
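To illustrate the memory argument, the sketch below computes the size of a clip mask restricted to the clip bounds, for an A8 versus a BGRA8 format. All names are hypothetical and the numbers in the usage note are made-up example dimensions, not measurements.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

struct IntRect {
  int x, y, width, height;
};

// Intersection of two rects; empty results get zero width/height.
IntRect Intersect(const IntRect& a, const IntRect& b) {
  const int x0 = std::max(a.x, b.x);
  const int y0 = std::max(a.y, b.y);
  const int x1 = std::min(a.x + a.width, b.x + b.width);
  const int y1 = std::min(a.y + a.height, b.y + b.height);
  return IntRect{x0, y0, std::max(0, x1 - x0), std::max(0, y1 - y0)};
}

enum class Format { A8, BGRA8 };

constexpr std::size_t BytesPerPixel(Format f) {
  return f == Format::A8 ? 1 : 4;
}

// Bytes needed for a mask that only covers the clipped part of the framebuffer.
std::size_t MaskBytes(const IntRect& framebuffer, const IntRect& clipBounds,
                      Format format) {
  const IntRect r = Intersect(framebuffer, clipBounds);
  return static_cast<std::size_t>(r.width) * static_cast<std::size_t>(r.height) *
         BytesPerPixel(format);
}
```

For an example 1000x1000 framebuffer and a 200x50 clip, the clipped A8 mask needs 10000 bytes, where a full-framebuffer BGRA8 mask would need 4000000.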
Differential Revision: https://phabricator.services.mozilla.com/D157050
Certain events like waiting on a round-trip to verify that the HostWebGLContext is
done using a shmem, or pushing a Skia layer which will need to be flattened later, can
be expensive, especially if they are used many times throughout a frame. However,
we weren't incrementing the profile counters for these situations, which can
lead to accelerated rendering persisting even when it would be more judicious to
fall back to software rendering.
Differential Revision: https://phabricator.services.mozilla.com/D157049
Sometimes the clip state is thrashed when we need to temporarily override
clipping to disable it. However, in this case, the clip mask itself remains
unchanged. The current invalidation scheme doesn't discern between generation
of the clip mask itself and setting the clip state for the shader, leading to
unnecessary regeneration of the clip mask.
This code just tries to discern when this is happening so we can refresh the
clip state without having to regenerate the clip mask unless truly necessary.
Differential Revision: https://phabricator.services.mozilla.com/D157048
Sometimes we hit requests to stroke a path with a rounded line in it that can't
be accelerated inside StrokeLine. This causes it to push a layer which can be
expensive. Go through DrawPath instead in this case which will still try to
accelerate the drawing with a cached texture that does not use a layer.
Differential Revision: https://phabricator.services.mozilla.com/D156791
The clip mask might not get deleted in a timely fashion and can be quite large.
Ensure it gets deleted promptly when DrawTargetWebgl goes away.
Differential Revision: https://phabricator.services.mozilla.com/D156644
DrawTargetWebgl currently only supports aligned rectangular clips that can be approximated
with a scissor. However, many use-cases require complex clips like rounded rectangles or
not-aligned regions. We can support these cases more generally by using a mask texture that
modulates the shader color. The mask texture is generated by doing a solid fill in the Skia
target over a clear background, which is safe because the Skia target is not in use while
the WebGL target is being rendered to. This adds one unconditional texture lookup to the
shaders which shouldn't have a big performance impact. When no clip mask is needed, we just
default to using a 1x1 solid texture.
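What the shader does with the mask can be emulated on the CPU as follows. This is a sketch under assumptions: the real work happens in GLSL, and MaskTexture, Mul255 and ApplyClipMask are hypothetical names. Clamp-to-edge sampling is assumed so that a 1x1 mask returns its single texel everywhere.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Single-channel mask. The default 1x1 mask of 255 means "no clipping".
struct MaskTexture {
  int width = 1;
  int height = 1;
  std::vector<std::uint8_t> alpha = {255};

  // Clamp-to-edge lookup: out-of-range coordinates read the nearest texel.
  std::uint8_t Sample(int x, int y) const {
    x = x < 0 ? 0 : (x >= width ? width - 1 : x);
    y = y < 0 ? 0 : (y >= height ? height - 1 : y);
    return alpha[y * width + x];
  }
};

struct Color {
  std::uint8_t r, g, b, a;  // Premultiplied alpha.
};

// c * m / 255 with rounding to nearest.
inline std::uint8_t Mul255(std::uint8_t c, std::uint8_t m) {
  return static_cast<std::uint8_t>((c * m + 127) / 255);
}

// The one unconditional mask lookup, modulating every color channel.
Color ApplyClipMask(Color c, const MaskTexture& mask, int x, int y) {
  const std::uint8_t m = mask.Sample(x, y);
  return Color{Mul255(c.r, m), Mul255(c.g, m), Mul255(c.b, m), Mul255(c.a, m)};
}
```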
Depends on D156224
Differential Revision: https://phabricator.services.mozilla.com/D156225
Currently we only support filled glyphs in DrawTargetWebgl. PDF.js can often render PDFs
that have stroked glyphs, so support for stroked glyphs is useful to prevent fallbacks.
This just adds support for plumbing StrokeOptions through to GlyphCache.
Differential Revision: https://phabricator.services.mozilla.com/D156224
We currently don't support repeat modes in the DrawTargetWebgl's image shader.
This change makes it only explicitly accelerate clamped modes. Other extend
modes will just go to the path rasterization option which will pre-rasterize
the image as a filled path and then upload to the texture cache. This will
let us keep the clamp path simple and fast without worrying about uncommon
repeat usage for now. If it ever turns out to be the case that repeat modes
are highly necessary for performance, we can revisit this.
Differential Revision: https://phabricator.services.mozilla.com/D155860
When we are rendering dark-on-light text, we invert the bitmap after
rendering to produce a standard white-on-black mask, since we must actually
render that as black-on-white to get CoreText to produce the correct dilation.
However, when we know we're rendering bitmap fonts for emoji, we don't actually
want this inversion to happen at all. So we need to ensure bitmaps go through
the normal light-on-dark path that doesn't do this.
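A reduced sketch of the intended behavior, assuming an 8-bit single-channel bitmap. InvertMask and FinishGlyphBitmap are hypothetical names and do not correspond to functions in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Invert 8-bit values in place, turning black-on-white into white-on-black.
void InvertMask(std::vector<std::uint8_t>& pixels) {
  for (std::uint8_t& p : pixels) {
    p = static_cast<std::uint8_t>(255 - p);
  }
}

// Post-process a rendered glyph bitmap. Outline text rendered dark-on-light
// is inverted into a standard mask; bitmap fonts (emoji) are left untouched.
void FinishGlyphBitmap(std::vector<std::uint8_t>& pixels, bool darkOnLight,
                       bool isBitmapFont) {
  if (darkOnLight && !isBitmapFont) {
    InvertMask(pixels);
  }
}
```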
Differential Revision: https://phabricator.services.mozilla.com/D154777