Author Topic: Is this fast enough as general circle fill?  (Read 863 times)

Re: Is this fast enough as general circle fill?
« Reply #30 on: June 26, 2018, 04:34:18 PM »
Sometimes, computers do things that are completely counter-intuitive to us, and we find ourselves having to step back as programmers and simply say, "WOW!!"   Here's a perfect example of that:

Code: [Select]
DIM SHARED Radius AS INTEGER: Radius = 1000
DIM SHARED CP(Radius, Radius) AS INTEGER 'CirclePoints


SCREEN _NEWIMAGE(800, 600, 256)
COLOR 15


PreCalcCircles

x = _WIDTH / 2
y = _HEIGHT / 2
k = 15
TestLoopLimit = 10000

t1 = TIMER(0.001)
FOR i = 1 TO TestLoopLimit
    CircleFillFast x, y, 300, k
NEXT

t2 = TIMER(0.001)
FOR i = 1 TO TestLoopLimit
    CircleFill x, y, 300, k
NEXT
t3 = TIMER(0.001)

PRINT USING "##.#### seconds with CircleFillFast"; t2 - t1
PRINT USING "##.#### seconds with CircleFill"; t3 - t2


SUB PreCalcCircles
    FOR i = 0 TO Radius 'each circle, for all radius sizes from 1 to limit
        FOR j = 0 TO i 'get the points for each line of those circles
            CP(i, j) = SQR(i * i - j * j)
        NEXT
    NEXT
END SUB


SUB CircleFillFast (x, y, Radius, k)
    FOR j = 0 TO Radius 'get the points for each line of those circles
        LINE (x - CP(Radius, j), y + j)-(x + CP(Radius, j), y + j), k, BF
        LINE (x - CP(Radius, j), y - j)-(x + CP(Radius, j), y - j), k, BF
    NEXT
END SUB

SUB CircleFill (CX AS LONG, CY AS LONG, R AS LONG, C AS LONG)
    DIM Radius AS LONG, RadiusError AS LONG
    DIM X AS LONG, Y AS LONG

    Radius = ABS(R)
    RadiusError = -Radius
    X = Radius
    Y = 0

    IF Radius = 0 THEN PSET (CX, CY), C: EXIT SUB

    ' Draw the middle span here so we don't draw it twice in the main loop,
    ' which would be a problem with blending turned on.
    LINE (CX - X, CY)-(CX + X, CY), C, BF

    WHILE X > Y
        RadiusError = RadiusError + Y * 2 + 1
        IF RadiusError >= 0 THEN
            IF X <> Y + 1 THEN
                LINE (CX - Y, CY - X)-(CX + Y, CY - X), C, BF
                LINE (CX - Y, CY + X)-(CX + Y, CY + X), C, BF
            END IF
            X = X - 1
            RadiusError = RadiusError - X * 2
        END IF
        Y = Y + 1
        LINE (CX - X, CY - Y)-(CX + X, CY - Y), C, BF
        LINE (CX - X, CY + Y)-(CX + X, CY + Y), C, BF
    WEND
END SUB

Here we look at two different circle fill routines: one that precalculates the offsets to the endpoints of each line making up a circle (the one I'd assumed would be faster), and the same old CircleFill routine I've shared countless times over the years with people on various QB64 forums.

When all is said and done though, CircleFill is STILL even faster than CircleFillFast, which pregenerates those end-points for us!

I've got to admit, I find these results rather shocking!  (Thus the name I chose when naming the CircleFill-NotSoFastAfterall routine.)  Apparently, in this case, the integer math used in CircleFill is faster than the time it takes for QB64 to look up those internal values from a preset array.

Who woulda thunk it?!!

Anywho, I thought I'd share, just so others could look over the two routines and compare.  Maybe there's a way to improve CircleFillFast so that it'd be faster than CircleFill, but if there is, I'm not seeing it at the moment.

It looks like CircleFill is still the fastest routine to use to rapidly fill a circle for us.  :D

Re: Is this fast enough as general circle fill?
« Reply #31 on: June 26, 2018, 04:39:20 PM »
Code: [Select]
DIM SHARED Radius AS INTEGER: Radius = 1000
DIM SHARED CP(Radius, Radius) AS INTEGER 'CirclePoints


SCREEN _NEWIMAGE(800, 600, 256)
COLOR 15


PreCalcCircles

x = _WIDTH / 2
y = _HEIGHT / 2
k = 15
TestLoopLimit = 10000

t1 = TIMER(0.001)
FOR i = 1 TO TestLoopLimit
    CircleFillFast x, y, 300, k
NEXT
SLEEP

t2 = TIMER(0.001)
FOR i = 1 TO TestLoopLimit
    CircleFill x, y, 300, k
NEXT

t3 = TIMER(0.001)

PRINT USING "##.#### seconds with CircleFillFast"; t2 - t1
PRINT USING "##.#### seconds with CircleFill"; t3 - t2


SUB PreCalcCircles
    FOR i = 0 TO Radius 'each circle, for all radius sizes from 1 to limit
        FOR j = 0 TO i 'get the points for each line of those circles
            CP(i, j) = SQR(i * i - j * j)
        NEXT
    NEXT
END SUB


SUB CircleFillFast (x, y, r, k)
    DO UNTIL j > r
        t = CP(r, j)
        LINE (x - t, y + j)-(x + t, y + j), k, BF
        LINE (x - t, y - j)-(x + t, y - j), k, BF
        j = j + 1
    LOOP
END SUB

SUB CircleFill (CX AS LONG, CY AS LONG, R AS LONG, C AS LONG)
    DIM Radius AS LONG, RadiusError AS LONG
    DIM X AS LONG, Y AS LONG

    Radius = ABS(R)
    RadiusError = -Radius
    X = Radius
    Y = 0

    IF Radius = 0 THEN PSET (CX, CY), C: EXIT SUB

    ' Draw the middle span here so we don't draw it twice in the main loop,
    ' which would be a problem with blending turned on.
    LINE (CX - X, CY)-(CX + X, CY), C, BF

    WHILE X > Y
        RadiusError = RadiusError + Y * 2 + 1
        IF RadiusError >= 0 THEN
            IF X <> Y + 1 THEN
                LINE (CX - Y, CY - X)-(CX + Y, CY - X), C, BF
                LINE (CX - Y, CY + X)-(CX + Y, CY + X), C, BF
            END IF
            X = X - 1
            RadiusError = RadiusError - X * 2
        END IF
        Y = Y + 1
        LINE (CX - X, CY - Y)-(CX + X, CY - Y), C, BF
        LINE (CX - X, CY + Y)-(CX + X, CY + Y), C, BF
    WEND

END SUB

And this version is quite a bit faster than the previous one -- but still slower than the original CircleFill routine.  The only real change here for performance?  Using t = CP(r, j) and then using t in our LINE statements instead of CP(r, j).  Referencing an array element is slower than referencing a single variable, and we see the change in performance here, noticeably.


Re: Is this fast enough as general circle fill?
« Reply #32 on: June 26, 2018, 04:45:50 PM »
*Remove the SLEEP statement in the above routine, which I was using to test things, or else you'll just sit there and not do much of anything while waiting for the program to complete.  I'd placed it there while testing changes and can't go back and edit the above post to remove it.  :P

Re: Is this fast enough as general circle fill?
« Reply #33 on: June 26, 2018, 05:05:11 PM »
Hi Steve. NUMERIC TYPES!

Try it now:

Code: [Select]
DIM SHARED Radius AS INTEGER: Radius = 1000
DIM SHARED CP(Radius, Radius) AS INTEGER 'CirclePoints


SCREEN _NEWIMAGE(800, 600, 256)
COLOR 15


PreCalcCircles

x = _WIDTH / 2
y = _HEIGHT / 2
k = 15
TestLoopLimit = 10000

t1 = TIMER(0.001)
FOR i = 1 TO TestLoopLimit
    CircleFillFast x#, y#, 300, k#
NEXT


t2 = TIMER(0.001)
FOR i = 1 TO TestLoopLimit
    CircleFill x, y, 300, k
NEXT

t3 = TIMER(0.001)

PRINT USING "##.#### seconds with CircleFillFast"; t2 - t1
PRINT USING "##.#### seconds with CircleFill"; t3 - t2


SUB PreCalcCircles
    DIM i AS INTEGER, j AS INTEGER
    FOR i = 0 TO Radius 'each circle, for all radius sizes from 1 to limit
        FOR j = 0 TO i 'get the points for each line of those circles
            CP(i, j) = SQR(i * i - j * j)
        NEXT
    NEXT
END SUB


SUB CircleFillFast (x AS INTEGER, y AS INTEGER, r AS INTEGER, k AS INTEGER)
    REDIM t AS INTEGER, j AS INTEGER
    DO UNTIL j > r
        t = CP(r, j)
        LINE (x - t, y + j)-(x + t, y + j), k, BF
        LINE (x - t, y - j)-(x + t, y - j), k, BF
        j = j + 1
    LOOP
END SUB

SUB CircleFill (CX AS LONG, CY AS LONG, R AS LONG, C AS LONG)
    DIM Radius AS LONG, RadiusError AS LONG
    DIM X AS LONG, Y AS LONG

    Radius = ABS(R)
    RadiusError = -Radius
    X = Radius
    Y = 0

    IF Radius = 0 THEN PSET (CX, CY), C: EXIT SUB

    ' Draw the middle span here so we don't draw it twice in the main loop,
    ' which would be a problem with blending turned on.
    LINE (CX - X, CY)-(CX + X, CY), C, BF

    WHILE X > Y
        RadiusError = RadiusError + Y * 2 + 1
        IF RadiusError >= 0 THEN
            IF X <> Y + 1 THEN
                LINE (CX - Y, CY - X)-(CX + Y, CY - X), C, BF
                LINE (CX - Y, CY + X)-(CX + Y, CY + X), C, BF
            END IF
            X = X - 1
            RadiusError = RadiusError - X * 2
        END IF
        Y = Y + 1
        LINE (CX - X, CY - Y)-(CX + X, CY - Y), C, BF
        LINE (CX - X, CY + Y)-(CX + X, CY + Y), C, BF
    WEND

END SUB

Coding is relax (At least sometimes)

Re: Is this fast enough as general circle fill?
« Reply #34 on: June 26, 2018, 05:12:18 PM »
Just changing the arguments to     CircleFillFast x#, y#, 300, k#     breaks the program.  Nothing ever assigns x#, y#, or k# (they're separate variables from x, y, and k), so you're placing the circle at 0,0 with color 0....  75% of it is drawn off the screen, and that's going to make it appear to run faster than it actually is.

You can't compare speeds of drawing 1/4 of a circle, against speeds of drawing a whole circle.  ;)
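
For anyone wondering why the suffix matters, here is a minimal sketch (assuming no DEFtype or _DEFINE statement is in effect). In QB64, as in QBasic, x and x# are two separate variables, so assigning x leaves x# at its default of 0:

```basic
x = 400 'no suffix: x is a SINGLE by default
PRINT x 'prints 400
PRINT x# 'x# is a separate DOUBLE that was never assigned, so this prints 0
```

That is why the CircleFillFast call in the previous post ends up centered on 0,0.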

You'd need to change all instances of x, y, and k to the double-precision x#, y#, and k#, and then you end up with similar speed results as before.

Code: [Select]
DIM SHARED Radius AS INTEGER: Radius = 1000
DIM SHARED CP(Radius, Radius) AS INTEGER 'CirclePoints


SCREEN _NEWIMAGE(800, 600, 256)
COLOR 15


PreCalcCircles

x# = _WIDTH / 2
y# = _HEIGHT / 2
k# = 15
TestLoopLimit = 10000

t1 = TIMER(0.001)
FOR i = 1 TO TestLoopLimit
    CircleFillFast x#, y#, 300, k#
NEXT


t2 = TIMER(0.001)
FOR i = 1 TO TestLoopLimit
    CircleFill x#, y#, 300, k#
NEXT

t3 = TIMER(0.001)

PRINT USING "##.#### seconds with CircleFillFast"; t2 - t1
PRINT USING "##.#### seconds with CircleFill"; t3 - t2


SUB PreCalcCircles
    DIM i AS INTEGER, j AS INTEGER
    FOR i = 0 TO Radius 'each circle, for all radius sizes from 1 to limit
        FOR j = 0 TO i 'get the points for each line of those circles
            CP(i, j) = SQR(i * i - j * j)
        NEXT
    NEXT
END SUB


SUB CircleFillFast (x AS INTEGER, y AS INTEGER, r AS INTEGER, k AS INTEGER)
    REDIM t AS INTEGER, j AS INTEGER
    DO UNTIL j > r
        t = CP(r, j)
        LINE (x - t, y + j)-(x + t, y + j), k, BF
        LINE (x - t, y - j)-(x + t, y - j), k, BF
        j = j + 1
    LOOP
END SUB

SUB CircleFill (CX AS LONG, CY AS LONG, R AS LONG, C AS LONG)
    DIM Radius AS LONG, RadiusError AS LONG
    DIM X AS LONG, Y AS LONG

    Radius = ABS(R)
    RadiusError = -Radius
    X = Radius
    Y = 0

    IF Radius = 0 THEN PSET (CX, CY), C: EXIT SUB

    ' Draw the middle span here so we don't draw it twice in the main loop,
    ' which would be a problem with blending turned on.
    LINE (CX - X, CY)-(CX + X, CY), C, BF

    WHILE X > Y
        RadiusError = RadiusError + Y * 2 + 1
        IF RadiusError >= 0 THEN
            IF X <> Y + 1 THEN
                LINE (CX - Y, CY - X)-(CX + Y, CY - X), C, BF
                LINE (CX - Y, CY + X)-(CX + Y, CY + X), C, BF
            END IF
            X = X - 1
            RadiusError = RadiusError - X * 2
        END IF
        Y = Y + 1
        LINE (CX - X, CY - Y)-(CX + X, CY - Y), C, BF
        LINE (CX - X, CY + Y)-(CX + X, CY + Y), C, BF
    WEND

END SUB

Re: Is this fast enough as general circle fill?
« Reply #35 on: June 26, 2018, 05:26:52 PM »
The best speed I can generate by tweaking things is this (so far):

Code: [Select]
_DEFINE A-Z AS LONG
DIM SHARED Radius AS INTEGER: Radius = 1000
DIM SHARED CP(Radius, Radius) AS INTEGER 'CirclePoints


SCREEN _NEWIMAGE(800, 600, 256)
COLOR 15


PreCalcCircles

x = _WIDTH / 2
y = _HEIGHT / 2
k = 15
TestLoopLimit = 10000

t1# = TIMER(0.001)
FOR i = 1 TO TestLoopLimit
    CircleFillFast x, y, 300, k
NEXT


t2# = TIMER(0.001)
FOR i = 1 TO TestLoopLimit
    CircleFill x, y, 300, k
NEXT

t3# = TIMER(0.001)

PRINT USING "##.#### seconds with CircleFillFast"; t2# - t1#
PRINT USING "##.#### seconds with CircleFill"; t3# - t2#


SUB PreCalcCircles
    FOR i = 0 TO Radius 'each circle, for all radius sizes from 1 to limit
        FOR j = 0 TO i 'get the points for each line of those circles
            CP(i, j) = SQR(i * i - j * j)
        NEXT
    NEXT
END SUB


SUB CircleFillFast (x, y, r, k)
    dc = _DEFAULTCOLOR
    COLOR k
    DO UNTIL j > r
        t = CP(r, j)
        LINE (x - t, y + j)-STEP(t + t, 0), , BF
        LINE (x - t, y - j)-STEP(t + t, 0), , BF
        j = j + 1
    LOOP
    COLOR dc
END SUB

SUB CircleFill (CX AS LONG, CY AS LONG, R AS LONG, C AS LONG)
    DIM Radius AS LONG, RadiusError AS LONG
    DIM X AS LONG, Y AS LONG

    Radius = ABS(R)
    RadiusError = -Radius
    X = Radius
    Y = 0

    IF Radius = 0 THEN PSET (CX, CY), C: EXIT SUB
    ' Draw the middle span here so we don't draw it twice in the main loop,
    ' which would be a problem with blending turned on.
    LINE (CX - X, CY)-(CX + X, CY), C, BF

    WHILE X > Y
        RadiusError = RadiusError + Y * 2 + 1
        IF RadiusError >= 0 THEN
            IF X <> Y + 1 THEN
                LINE (CX - Y, CY - X)-(CX + Y, CY - X), C, BF
                LINE (CX - Y, CY + X)-(CX + Y, CY + X), C, BF
            END IF
            X = X - 1
            RadiusError = RadiusError - X * 2
        END IF
        Y = Y + 1
        LINE (CX - X, CY - Y)-(CX + X, CY - Y), C, BF
        LINE (CX - X, CY + Y)-(CX + X, CY + Y), C, BF
    WEND

END SUB

Speeds on my machine are 0.55 seconds for CircleFillFast versus 0.50 seconds for CircleFill, so both of them are plenty fast for most purposes.  (After all, this is drawing a filled circle the whole size of my screen 10,000 times.)

Maybe if someone wanted/needed the endpoints for collision detection or some other use, then the precalculated values in CircleFillFast would be useful.  As it is though, I suppose I'm still going to have to call CircleFill the quickest routine I've personally seen so far. 

Re: Is this fast enough as general circle fill?
« Reply #36 on: June 26, 2018, 05:47:47 PM »
You're right. I did not realize that. Whatever I'm trying to do, it's not better.

Re: Is this fast enough as general circle fill?
« Reply #37 on: June 26, 2018, 05:54:19 PM »
Steve, can you please look at this source and tell me why it is so slow? https://www.qb64.org/forum/index.php?topic=287.msg1756#msg1756

Re: Is this fast enough as general circle fill?
« Reply #38 on: June 26, 2018, 06:06:03 PM »
Steve, can you please look at this source and tell me why it is so slow? https://www.qb64.org/forum/index.php?topic=287.msg1756#msg1756

Try it with $CHECKING:OFF.

I'll give it a better looking over later tonight, but the wife has me heading out for supper here in a few moments.  $CHECKING might make a noticeable boost in performance for you though. 

Re: Is this fast enough as general circle fill?
« Reply #39 on: June 26, 2018, 06:09:53 PM »
I'll try it. Thank you. I'm going to sleep, it's almost midnight here. I'll see it tomorrow.

Re: Is this fast enough as general circle fill?
« Reply #40 on: June 26, 2018, 06:16:02 PM »
Advice 2:

Remove:

FUNCTION IN& (x AS INTEGER, y AS INTEGER)
    IN& = (_WIDTH * y + x) * 4
END FUNCTION


Function calls are slow.  Do the math directly, and don't call on _WIDTH.

Early in the routine, use w = _WIDTH, then:

  _MEMPUT m, m.OFFSET + (w * y1 + x) * 4, clr&

Re: Is this fast enough as general circle fill?
« Reply #41 on: June 26, 2018, 07:24:58 PM »
Fellippe, since you started comparing times, which is very useful to everyone, I'm posting this code here. I wrote an analogue of the LINE command using _MEMPUT. I expected great speed, but the result is enough to make you cry. It's terribly slow, lazy as a lemur. Could you tell me what I did wrong here this time? Thank you.

Code: [Select]

DIM SHARED m AS _MEM
J& = _NEWIMAGE(800, 600, 32)
m = _MEMIMAGE(J&)
SCREEN J&

T = TIMER

FOR test = 1 TO 1000
    vinceCircleFill 400, 300, 200, _RGB32(0, 255, 0)
NEXT test

PRINT TIMER - T

SUB vinceCircleFill (x AS LONG, y AS LONG, R AS LONG, C AS _UNSIGNED LONG)
    x0 = R
    y0 = 0
    e = 0
    DO WHILE y0 < x0
        IF e <= 0 THEN
            y0 = y0 + 1
            MEM_LINE x - x0, y + y0, x + x0, y + y0, C
            MEM_LINE x - x0, y - y0, x + x0, y - y0, C
            e = e + 2 * y0
        ELSE
            MEM_LINE x - y0, y - x0, x + y0, y - x0, C
            MEM_LINE x - y0, y + x0, x + y0, y + x0, C
            x0 = x0 - 1
            e = e - 2 * x0
        END IF
    LOOP
    MEM_LINE x - R, y, x + R, y, C
END SUB

SUB MEM_LINE (x1 AS INTEGER, y1 AS INTEGER, x2 AS INTEGER, y2 AS INTEGER, clr AS LONG)
    DEFLNG A-Z
    dX = x2 - x1
    dY = y2 - y1
    IF dX > dY OR dX = dY THEN
        x = x1: y = y1
        DO WHILE x <> x2
            x = x + 1
            y = y + (dY / dX)
            _MEMPUT m, m.OFFSET + IN&(x, y), clr&
        LOOP
    END IF
    IF dY > dX THEN
        x = x1: y = y1
        DO WHILE y <> y2
            x = x + (dX / dY)
            y = y + 1
            _MEMPUT m, m.OFFSET + IN&(x, y), clr&
        LOOP
    END IF
    IF x1 = x2 THEN
        FOR d = y1 TO y2
            _MEMPUT m, m.OFFSET + IN&(x1, d), clr&
        NEXT d
    END IF
    IF y1 = y2 THEN
        FOR d = x1 TO x2
            _MEMPUT m, m.OFFSET + IN&(d, y1), clr&
        NEXT d
    END IF
END SUB

FUNCTION IN& (x AS INTEGER, y AS INTEGER)
    IN& = (_WIDTH * y + x) * 4
END FUNCTION


A few tweaks to the above code, and you'll see a slight improvement in run time:
Code: [Select]
DIM SHARED m AS _MEM
J& = _NEWIMAGE(800, 600, 32)
m = _MEMIMAGE(J&)
SCREEN J&

T = TIMER

FOR test = 1 TO 1000
    vinceCircleFill 400, 300, 200, _RGB32(0, 255, 0)
NEXT test

PRINT TIMER - T

SUB vinceCircleFill (x AS LONG, y AS LONG, R AS LONG, C AS _UNSIGNED LONG)
    x0 = R
    y0 = 0
    e = 0
    DO WHILE y0 < x0
        IF e <= 0 THEN
            y0 = y0 + 1
            MEM_LINE x - x0, y + y0, x + x0, y + y0, C
            MEM_LINE x - x0, y - y0, x + x0, y - y0, C
            e = e + 2 * y0
        ELSE
            MEM_LINE x - y0, y - x0, x + y0, y - x0, C
            MEM_LINE x - y0, y + x0, x + y0, y + x0, C
            x0 = x0 - 1
            e = e - 2 * x0
        END IF
    LOOP
    MEM_LINE x - R, y, x + R, y, C
END SUB

SUB MEM_LINE (x1 AS INTEGER, y1 AS INTEGER, x2 AS INTEGER, y2 AS INTEGER, clr AS LONG)
    DEFLNG A-Z
    $CHECKING:OFF
    dX = x2 - x1
    dY = y2 - y1
    w = _WIDTH
    x = x1: y = y1
    IF dX >= dY THEN
        DO WHILE x <> x2
            x = x + 1
            y = y + (dY / dX)
            _MEMPUT m, m.OFFSET + (w * y + x) * 4, clr
        LOOP
    ELSE
        DO WHILE y <> y2
            x = x + (dX / dY)
            y = y + 1
            _MEMPUT m, m.OFFSET + (w * y + x) * 4, clr
        LOOP
    END IF
    IF x1 = x2 THEN
        d = y1
        DO UNTIL d > y2
            _MEMPUT m, m.OFFSET + (w * d + x1) * 4, clr
            d = d + 1
        LOOP
    END IF
    IF y1 = y2 THEN
        d = x1
        DO UNTIL d > x2
            _MEMPUT m, m.OFFSET + (w * y1 + d) * 4, clr
            d = d + 1
        LOOP
    END IF
    $CHECKING:ON
END SUB

As it was, it took 10.3 seconds to draw 1000 times on my machine.  (I changed the limit from 10,000 to 1,000 so I wouldn't have to wait all night for results.)  The modified version does the same thing in 3.4 seconds.

$CHECKING:OFF makes a small change (4.3 seconds with checking left on vs 3.4 seconds with it off), but the greatest change comes from dropping the multiple FUNCTION calls.  They're SLOOOOOOW in relation to other things in QB64 and should generally be avoided as much as possible in situations where speed might be a critical consideration for your program.  Changing the FOR-NEXT loops to DO-LOOP saves a little time as well, though it's one of those tweaks you'd only worry about AFTER you're certain the code runs right the first time around.

Still, I don't think it'll compare to the times of LINE (x1,y1)-(x2,y2), kolor, BF, since LINE BF has been extensively optimized at the C-level by Galleon, but it's still a nice improvement overall for us.  :)
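
One more thing that may be worth trying (a sketch only, not benchmarked): every MEM_LINE call made by vinceCircleFill is a horizontal span (y1 = y2), so in both versions above the first DO loop and the y1 = y2 block write the same row, and most pixels get stored twice. A span-only helper avoids that. MEM_HLINE is a hypothetical name, and the sketch assumes a 32-bit image, x1 <= x2, and a span that lies fully inside the image (no clipping).

```basic
DIM SHARED m AS _MEM
img& = _NEWIMAGE(800, 600, 32)
m = _MEMIMAGE(img&)
SCREEN img&

MEM_HLINE 200, 600, 300, _RGB32(0, 255, 0) 'one green span across row 300
SLEEP
_MEMFREE m
END

' Hypothetical helper, not from the thread: writes each pixel of one row once.
' Assumes x1 <= x2 and that the span is fully inside the image (no clipping).
SUB MEM_HLINE (x1 AS LONG, x2 AS LONG, y AS LONG, clr AS _UNSIGNED LONG)
    DIM o AS _OFFSET, n AS LONG
    $CHECKING:OFF
    o = m.OFFSET + (_WIDTH * y + x1) * 4 'address of the first pixel in the span
    n = x2 - x1 + 1 'number of pixels to write
    DO WHILE n > 0
        _MEMPUT m, o, clr
        o = o + 4 '4 bytes per 32-bit pixel
        n = n - 1
    LOOP
    $CHECKING:ON
END SUB
```

In vinceCircleFill the calls would then look like MEM_HLINE x - x0, x + x0, y + y0, C and so on.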

Re: Is this fast enough as general circle fill?
« Reply #42 on: June 27, 2018, 11:40:52 AM »
Steve, thank you for your detailed analysis of the problem. It never occurred to me to test how long function calls take. It's a shame that something as useful as functions is that slow to call.

Re: Is this fast enough as general circle fill?
« Reply #43 on: June 27, 2018, 12:36:50 PM »
Looking at it fresh this morning, I see another way you could speed it up even more, with the use of some creative math.

        DO WHILE y <> y2
            x = x + (dX / dY)
            y = y + 1
            _MEMPUT m, m.OFFSET + (w * y + x) * 4, clr
        LOOP

If you make y1 and y2 in relation to w to begin with, you could reduce a few math operations in the DO LOOPS...

Y1 = Y1 * w: Y2 = Y2 * w     <---- first line at the start of the SUB

Y = Y + w   <----  in the DO LOOP
_MEMPUT m, m.offset + (y + x) *4, clr    <----  internally in the routine now, Y increments by _WIDTH amount naturally, and we remove a multiplication step of operations.


*************

Since X and Y are integer values, you can probably change the math to be a little more efficient elsewhere as well:

Instead of X = X + (DX / DY), make it X = X + (DX \ DY)

Integer division (\) is considerably faster than real division (/), so if you can use it and need a program to optimize speed, do so. 
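
A quick illustration of the difference, with one caveat: \ throws away the fraction. In the loop above, where dY is larger than dX, DX \ DY works out to 0 for non-negative deltas, so X never advances. That's fine for the horizontal spans a circle fill needs, but a general sloped line would need something like a Bresenham-style error term instead (my assumption; nothing in this thread covers it).

```basic
PRINT 7 / 2 'real division: 3.5
PRINT 7 \ 2 'integer division: 3
PRINT 3 \ 7 'integer division drops the fraction entirely: 0
```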

****************

Lots of little tricks and tweaks which can be used to optimize speed.  The main thing you have to be *really* careful of is not to obfuscate the code beyond the point of being able to understand it in the future.  Just because you *can* make it faster, it doesn't mean you always *should* -- especially if you alter it so much you can't figure it out and debug it or alter it, at a later date.

Fast is good, but understanding is better.  ;D

Re: Is this fast enough as general circle fill?
« Reply #44 on: June 27, 2018, 12:45:28 PM »
In fact, you may be able to remove that * 4 operation completely, if you multiply the X/Y/W values by 4 at the beginning of the program as well.

Then it'd just be a case of _MEMPUT m, m.offset + y + x, clr, which would save several math operations for each DO..LOOP.
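
Putting the last two suggestions together, here is a rough sketch (not benchmarked). MEM_VLINE is a hypothetical helper, and it assumes a 32-bit image, y1 <= y2, and no clipping. The byte offset is computed once and then stepped by one row's worth of bytes, so the loop body contains no multiplication at all.

```basic
DIM SHARED m AS _MEM
img& = _NEWIMAGE(800, 600, 32)
m = _MEMIMAGE(img&)
SCREEN img&

MEM_VLINE 400, 100, 500, _RGB32(255, 255, 0) 'one yellow vertical line
SLEEP
_MEMFREE m
END

' Hypothetical helper, not from the thread.
' Assumes y1 <= y2 and that the line is fully inside the image (no clipping).
SUB MEM_VLINE (x AS LONG, y1 AS LONG, y2 AS LONG, clr AS _UNSIGNED LONG)
    DIM o AS _OFFSET, rowBytes AS LONG, n AS LONG
    rowBytes = _WIDTH * 4 'bytes per image row, computed once
    o = m.OFFSET + y1 * rowBytes + x * 4 'address of the top pixel
    n = y2 - y1 + 1 'number of pixels to write
    DO WHILE n > 0
        _MEMPUT m, o, clr
        o = o + rowBytes 'move down one row: an addition, no multiply
        n = n - 1
    LOOP
END SUB
```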