We have a heuristic that falls back from stroking certain paths with acceleration because
both WGR and the texture cache can have slow output as the size of the path increases in
screen area. AA-Stroke tends to have better output for these paths, or at least, this heuristic
tends to have a detrimental effect on AA-Stroke usage and performance in important tests.
Let's move the heuristic to after we use AA-Stroke for now.
Differential Revision: https://phabricator.services.mozilla.com/D178549
For cases where bitmaps are compressed down to alpha textures, the underlying
assumption is that these were supposed to be treated as luminance data as well
in the shader, or rather, the alpha represents both the opacity and intensity.
We weren't properly swizzling in the shader to accomplish this. This fixes that.
Differential Revision: https://phabricator.services.mozilla.com/D176986
It seems that using a texture as a backing surface for a framebuffer
while simultaneously trying to use TexSubImage to upload to it aggravates
OpenGL driver bugs regarding the underlying representation of that
texture.
Given that I couldn't find an expedient way to preserve that optimized
upload path, we instead have to fall back to just uploading the sampling
rect of the surface to a transient texture that is then drawn via a shader
to the framebuffer.
Differential Revision: https://phabricator.services.mozilla.com/D176240
This just tries to address fairly random changes in the Skia API and correct
our usage of it in Moz2D and some other places.
Differential Revision: https://phabricator.services.mozilla.com/D173324
DrawTargetWebgl renders a path by uploading vertex data to the back of
a large VBO using glBufferSubData then issuing a draw call, orphaning
the buffer when it becomes full. This results in many glBufferSubData
calls being interleaved with draw calls. On Mali GPUs this causes
severe performance issues as the driver is unable to determine that
any pending draw calls do not reference the updated region of the
buffer, and therefore must create a copy of the buffer for each
update.
However, since *we* know that we never overwrite a region that is
referenced by a submitted draw call, we can force the driver to avoid
making these copies. We do so by adding a new function
UnsynchronizedBufferSubData(), which acts like BufferSubData so long
as this rule is followed. Internally, this uses glMapBufferRange with
GL_MAP_UNSYNCHRONIZED_BIT, allowing the driver to omit the extraneous
copies.
Differential Revision: https://phabricator.services.mozilla.com/D174685
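The append-only rule described above can be sketched as plain bookkeeping. This is a minimal illustration, not the actual DrawTargetWebgl code: the struct and member names are hypothetical, no GL calls are issued, and the comments only mark where a real implementation would orphan or map the buffer (the exact map flags beyond GL_MAP_UNSYNCHRONIZED_BIT are an assumption).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping for an append-only vertex buffer. Data is only
// ever written past the region that submitted draw calls may reference.
struct AppendOnlyVertexBuffer {
  size_t capacity;
  size_t used = 0;         // Bytes already written (may be referenced by draws).
  size_t orphanCount = 0;  // How many times the storage was orphaned.

  explicit AppendOnlyVertexBuffer(size_t aCapacity) : capacity(aCapacity) {}

  // Reserves aBytes at the back of the buffer and stores the write offset in
  // *aOffset. Returns false if the data can never fit in the buffer.
  bool Append(size_t aBytes, size_t* aOffset) {
    if (aBytes > capacity) {
      return false;
    }
    if (aBytes > capacity - used) {
      // Buffer is full: a real implementation would orphan the storage here,
      // e.g. glBufferData(target, capacity, nullptr, usage).
      used = 0;
      ++orphanCount;
    }
    *aOffset = used;
    // A real implementation would write the data here through
    // glMapBufferRange(target, used, aBytes,
    //                  GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT).
    used += aBytes;
    assert(used <= capacity);
    return true;
  }
};
```

Because every write lands at an offset at or beyond `used`, no region referenced by an earlier draw call is overwritten until the buffer is orphaned, which is the condition under which the unsynchronized mapping is safe.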
This implements OP_CLEAR by ensuring the shader always outputs a mask value that
represents coverage. Clear color is then blended in proportion to the coverage
value.
Depends on D171025
Differential Revision: https://phabricator.services.mozilla.com/D172956
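As a rough illustration of blending the clear color in proportion to coverage, the arithmetic might look like the sketch below. The Color struct and BlendClear function are hypothetical names, and the real change presumably expresses this through shader output and GL blend state rather than CPU code.

```cpp
#include <cassert>

struct Color {
  float r, g, b, a;
};

// Interpolates from the destination toward the clear color by the coverage
// value, which is expected to lie in [0, 1].
Color BlendClear(const Color& aDst, const Color& aClear, float aCoverage) {
  assert(aCoverage >= 0.0f && aCoverage <= 1.0f);
  const float keep = 1.0f - aCoverage;
  return Color{aDst.r * keep + aClear.r * aCoverage,
               aDst.g * keep + aClear.g * aCoverage,
               aDst.b * keep + aClear.b * aCoverage,
               aDst.a * keep + aClear.a * aCoverage};
}
```

With zero coverage the destination is left untouched, with full coverage it is replaced by the clear color, and partially covered edge pixels end up in between.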
This implements some optimizations targeted at Canvas2D's putImageData:
1) Track whether the canvas is in the initially clear state so that we avoid
reading back from the WebGL framebuffer into the Skia framebuffer when a
fallback does occur or when a data snapshot is needed.
2) For surfaces that are too large to upload to a texture, directly use
glTexSubImage2D to draw data to the WebGL framebuffer, bypassing a separate
texture upload.
3) Disregard the surface size limits for SurfacePatterns containing a
compatible texture handle.
Differential Revision: https://phabricator.services.mozilla.com/D171773
If the user does not actually support WebGL 2, attempting to create a DrawTargetWebgl will always fail.
However, each time it goes to create a DrawTargetWebgl, it will try to initialize a WebGL 2 context again.
Each time it does this, if the user is using GLX, it will call XCreatePixmap and glXCreateContextAttribs,
which can run afoul of a libX11 race condition. This will happen on every page load where a new canvas is
encountered.
It would be better to note that the failure is non-recoverable and not try to use DrawTargetWebgl on
subsequent attempts. Here we set the sContextInitError value to indicate there are non-recoverable
errors, after which it will bypass attempting creation the next time.
This means the worst exposure the user will have to the libX11 race condition is once per process lifetime,
which should greatly reduce the incidence.
Differential Revision: https://phabricator.services.mozilla.com/D171016
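The latch described above can be sketched as follows. Only the idea of a sticky error value comes from the commit message; the ContextFactory type, its members, and the error string are hypothetical, and a per-object member stands in for the static sContextInitError.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the code that creates the shared WebGL 2 context.
struct ContextFactory {
  std::string mContextInitError;  // Non-empty once a non-recoverable error occurred.
  int mInitAttempts = 0;          // Counts real (expensive) initialization attempts.

  // aSupportsWebGL2 simulates whether context creation can succeed at all.
  bool TryCreateContext(bool aSupportsWebGL2) {
    if (!mContextInitError.empty()) {
      // A previous attempt failed for good; skip the expensive path entirely.
      return false;
    }
    ++mInitAttempts;
    if (!aSupportsWebGL2) {
      mContextInitError = "WebGL 2 not supported";
      return false;
    }
    return true;
  }
};
```

However many canvases are created afterwards, the expensive initialization path runs at most once after a non-recoverable failure.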
If we choose to accelerate a single line path, we need to take care not to use
the line cap when the path is closed. When the path is closed, we need to use
the line join instead.
Differential Revision: https://phabricator.services.mozilla.com/D170469
Rectangles with large floating-point values can start to lose mantissa bits when transformed directly
by the vertex shader, pushed through the viewport transform, and then clipped later. OTOH, the path
fallback (WGR/AA-Stroke or Skia) does not stress floating-point precision as much, so generally
returns more correct results in these cases. This becomes noticeable when the same rectangle is
filled and then subsequently stroked, since the fill may use the rect shader, while the stroke
will fall back to using a path.
To avoid hitting this issue, this checks if a rect's coordinates are outside a certain reasonable
limit, and if so, falls back to relying on path geometry to handle transform and clipping safely.
If within the limit, the shader precision loss doesn't noticeably impact the results so it is still
safe to use the fast-path.
Differential Revision: https://phabricator.services.mozilla.com/D170270
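A minimal sketch of such a limit check is shown below. The Rect struct, the function name, and in particular the limit value are assumptions chosen for illustration; the commit message does not state the actual threshold.

```cpp
#include <cassert>
#include <cmath>

struct Rect {
  float x, y, width, height;
};

// Hypothetical limit; the real threshold is not given in the commit message.
constexpr float kHypotheticalCoordLimit = 32768.0f;

// Returns true if all four edges of the rect lie within the limit, meaning
// the rect shader fast-path can be used. Returns false otherwise, in which
// case the caller would fall back to path geometry. NaN coordinates fail
// every comparison and therefore also take the fallback.
bool RectFitsShaderPrecision(const Rect& aRect,
                             float aLimit = kHypotheticalCoordLimit) {
  return std::fabs(aRect.x) <= aLimit && std::fabs(aRect.y) <= aLimit &&
         std::fabs(aRect.x + aRect.width) <= aLimit &&
         std::fabs(aRect.y + aRect.height) <= aLimit;
}
```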
Skia upstream removed deprecated clip ops that could be used to replace
the clipping stack and bypass clips. We shouldn't really need to do this
anymore, as we can work around it just using public APIs.
The only SkCanvas operation that allows us to bypass clipping is
writePixels, which still allows us to implement CopySurface/putImageData.
Other instances where we were using the replace op for DrawTargetWebgl
layering support can just be worked around by creating a separate
DrawTargetSkia pointing to the same pixel data, but on which no clipping
or transforms are applied so that we can freely do drawing operations
on it to the base layer pixel data regardless of any user-applied clipping.
Differential Revision: https://phabricator.services.mozilla.com/D168039
This updates wpf-gpu-raster to a version that adds support for
GPUs/drivers that use truncation instead of rounding when converting
vertices to fixed point.
It also adds the GL vendor to InitContextResult so that we can detect
AMD on macOS and tell wpf-gpu-raster that truncation is going to happen.
Differential Revision: https://phabricator.services.mozilla.com/D167503
This adds a debug indicator controlled by the pref gfx.canvas.accelerated.debug.
A green square is drawn in the upper right corner of the canvas to let us know if
acceleration is being used or not.
Differential Revision: https://phabricator.services.mozilla.com/D165018
CanvasRenderingContext2D relies upon CreateSimilarDrawTarget to extract
a subrect from a surface to draw. However, DrawTargetWebgl does not return an
accelerated DT for that API as creating an entirely new context can be quite
expensive.
To work around this, this adds a specific ExtractSubrect API for SourceSurface
that can bypass the entire need to create a temporary DrawTarget to copy into.
Differential Revision: https://phabricator.services.mozilla.com/D164118
This pre-allocates a vertex output buffer in DrawTargetWebgl so that we can generate
wpf-gpu-raster and aa-stroke output into it. This way, they don't have to realloc
a Vec for pushes or changing into a boxed slice. This can net 5-10% on profiles for
the demos noted in the bug.
Depends on D163989
Differential Revision: https://phabricator.services.mozilla.com/D163990
This increases the amount of quantization applied to path cache entries
to 0.25 increments (2 bits of subpixel precision), leading to 4x4=16
subpixel buckets in total along both axes.
Differential Revision: https://phabricator.services.mozilla.com/D163749
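To make the bucket arithmetic concrete, here is a small sketch of quantizing an offset to 0.25 increments and mapping its fractional part to one of the 4x4 buckets. The function names are hypothetical, and rounding down (rather than to nearest) is an assumption; the commit message does not say which the cache uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Snaps a coordinate down to the nearest multiple of 0.25, keeping 2 bits
// of subpixel precision.
float QuantizeQuarter(float aValue) {
  return std::floor(aValue * 4.0f) / 4.0f;
}

// Maps the fractional part of (aX, aY) to a bucket index in [0, 15]:
// 4 buckets along x times 4 buckets along y.
int SubpixelBucket(float aX, float aY) {
  const float fx = aX - std::floor(aX);
  const float fy = aY - std::floor(aY);
  // Clamp to 3 in case float rounding makes the fractional part reach 1.0.
  const int bx = std::min(3, static_cast<int>(fx * 4.0f));
  const int by = std::min(3, static_cast<int>(fy * 4.0f));
  assert(bx >= 0 && by >= 0);
  return by * 4 + bx;
}
```

Two paths whose offsets differ by less than a quarter pixel within the same bucket can then share one cache entry, at the cost of up to a quarter pixel of positional error.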
This fixes an incidental bug with the pref to turn off GPU stroking. It's not supposed to disable
caching strokes as textures. This allows that to work again even if GPU stroking is preffed off.
Differential Revision: https://phabricator.services.mozilla.com/D163307
It seems like this is slow for now until we implement a better way than WPF-gpu-raster
for stroking paths. Just hide this behind a pref so we can at least test it but not
impact performance as badly.
Differential Revision: https://phabricator.services.mozilla.com/D163248
For use-cases that repeatedly pop and re-push the same clips over and over, we can regenerate the
same mask that is already still stored, because we only detect that clip state changed, rather than
that it changed to exactly the same state it was previously.
This just remembers the previous state of the clip stack at the time the clip mask was generated
so that we can compare the previous and current state. If they're the same, we can assume there
is no need to regenerate the clip mask again and simply reuse it.
Differential Revision: https://phabricator.services.mozilla.com/D162699
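The comparison described above can be sketched as follows. ClipEntry and ClipMaskCache are hypothetical types that model each clip as a simple rectangle; the real clip stack also carries transforms and paths, which are omitted here.

```cpp
#include <cassert>
#include <vector>

// Simplified clip: an axis-aligned rectangle.
struct ClipEntry {
  float x, y, width, height;
  bool operator==(const ClipEntry& aOther) const {
    return x == aOther.x && y == aOther.y && width == aOther.width &&
           height == aOther.height;
  }
};

struct ClipMaskCache {
  std::vector<ClipEntry> mCachedStack;  // Clip stack when the mask was built.
  bool mValid = false;
  int mRegenerations = 0;  // Counts how often the mask had to be rebuilt.

  // Rebuilds the mask only if the clip stack differs from the one the
  // stored mask was generated from.
  void EnsureMask(const std::vector<ClipEntry>& aCurrentStack) {
    if (mValid && mCachedStack == aCurrentStack) {
      return;  // Same state as before: reuse the existing mask.
    }
    mCachedStack = aCurrentStack;
    mValid = true;
    ++mRegenerations;  // A real implementation would render the mask here.
  }
};
```

Only the most recent state is remembered, so alternating between two different clip stacks would still regenerate the mask each time.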
WebGL doesn't reliably implement line smoothing, so we can't rely on it, making it
useless for canvas lines. Instead, just fall back to emulating it manually with paths.
Differential Revision: https://phabricator.services.mozilla.com/D162540
Some paths may contain so many types that their vertex representation far exceeds their
software rasterized representation in memory size. As a sanity-check, we should just set
a hard limit on the maximum allowed complexity of a path that we attempt to supply to
wpf-gpu-raster. Beyond that, we will instead just rasterize in software and upload
to a texture which can be more performant.
Differential Revision: https://phabricator.services.mozilla.com/D162481
By default, BorrowSnapshot is pessimistic and forces DrawTargetWebgl to return a data snapshot on
the assumption that the snapshot might be used off thread. However, if we actually know the DrawTarget
we're going to be drawing the snapshot to, then we can check if they're both DrawTargetWebgls with
the same internal SharedContext. In that case, we can use a SourceSurfaceWebgl snapshot which can
pass through a GPU texture to the target. This requires us to plumb the DrawTarget down through
SurfaceFromElement all the way to DrawTargetWebgl to make this decision.
Differential Revision: https://phabricator.services.mozilla.com/D162176
This adds a path vertex buffer where triangle list output from WGR is stored.
Each PathCacheEntry can potentially reference a range of vertexes in this buffer
corresponding to triangles for that entry. When this buffer is full, it gets
orphaned and clears corresponding cache entries, so that it can start anew.
Differential Revision: https://phabricator.services.mozilla.com/D161479
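The relationship between cache entries and the shared vertex buffer might be modeled roughly as below. PathVertexCache, VertexRange, and the string keys are hypothetical simplifications; the real PathCacheEntry is keyed on path geometry and render state, and the buffer lives on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Range of vertices inside the shared buffer that belongs to one entry.
struct VertexRange {
  size_t offset;
  size_t count;
};

class PathVertexCache {
 public:
  explicit PathVertexCache(size_t aCapacity) : mCapacity(aCapacity) {}

  // Looks up the range for aKey, allocating a new range at the back of the
  // buffer on a miss. Returns false if the path can never fit.
  bool GetOrInsert(const std::string& aKey, size_t aVertexCount,
                   VertexRange* aOut) {
    auto it = mEntries.find(aKey);
    if (it != mEntries.end()) {
      *aOut = it->second;  // Cache hit: reuse the stored triangles.
      return true;
    }
    if (aVertexCount > mCapacity) {
      return false;
    }
    if (aVertexCount > mCapacity - mUsed) {
      // Buffer full: orphan it and drop every entry that pointed into it.
      mEntries.clear();
      mUsed = 0;
    }
    *aOut = VertexRange{mUsed, aVertexCount};
    mUsed += aVertexCount;
    assert(mUsed <= mCapacity);
    mEntries.emplace(aKey, *aOut);
    return true;
  }

  size_t EntryCount() const { return mEntries.size(); }

 private:
  size_t mCapacity;
  size_t mUsed = 0;
  std::unordered_map<std::string, VertexRange> mEntries;
};
```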