Three more changes to your routine, Petr, and the time goes from 3.4 seconds to 2.6, then 1.68, then 1.47...

The fastest version is this one:

DIM SHARED m AS _MEM

J& = _NEWIMAGE(800, 600, 32)

m = _MEMIMAGE(J&)

SCREEN J&

T = TIMER

FOR test = 1 TO 1000

vinceCircleFill 400, 300, 200, _RGB32(0, 255, 0)

NEXT test

PRINT TIMER - T

SUB vinceCircleFill (x AS LONG, y AS LONG, R AS LONG, C AS _UNSIGNED LONG)

x0 = R

y0 = 0

e = 0

DO WHILE y0 < x0

IF e <= 0 THEN

y0 = y0 + 1

MEM_LINE x - x0, y + y0, x + x0, y + y0, C

MEM_LINE x - x0, y - y0, x + x0, y - y0, C

e = e + 2 * y0

ELSE

MEM_LINE x - y0, y - x0, x + y0, y - x0, C

MEM_LINE x - y0, y + x0, x + y0, y + x0, C

x0 = x0 - 1

e = e - 2 * x0

END IF

LOOP

MEM_LINE x - R, y, x + R, y, C

END SUB

SUB MEM_LINE (x1t AS INTEGER, y1t AS INTEGER, x2t AS INTEGER, y2t AS INTEGER, clr AS LONG)

DEFLNG A-Z

$CHECKING:OFF

w = _WIDTH

x1 = x1t * 4: x2 = x2t * 4

y1 = y1t * w * 4: y2 = y2t * w * 4

dX = x2 - x1

dY = y2 - y1

x = x1: y = y1

IF dX >= dY THEN

inc = dY \ dX

DO WHILE x <> x2

x = x + 4

y = y + inc

_MEMPUT m, m.OFFSET + y + x, clr

LOOP

ELSE

inc = dX \ dY

DO WHILE y <> y2

x = x + inc

y = y + w * 4 ' y is a byte offset here, so one row down is w pixels * 4 bytes

_MEMPUT m, m.OFFSET + y + x, clr

LOOP

END IF

IF x1 = x2 THEN

d = y1

DO UNTIL d > y2

_MEMPUT m, m.OFFSET + d + x1, clr

d = d + w * 4 ' d is a byte offset, so one row down is w pixels * 4 bytes

LOOP

END IF

IF y1 = y2 THEN

d = x1

DO UNTIL d > x2

_MEMPUT m, m.OFFSET + y1 + d, clr

d = d + 4

LOOP

END IF

$CHECKING:ON

END SUB

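The first change, as I mentioned above, was to change how we think of X/Y when calculating the offset. Instead of keeping x and y as pixel coordinates and computing m.OFFSET + (w * y + x) * 4 for every pixel, x and y are converted to byte offsets once, at the top of the routine (x * 4 and y * w * 4). After that the address is simply m.OFFSET + y + x. This took the time from 3.4 seconds to 2.6.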
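Here's a quick sketch of why the two forms land on the same byte. The names px, py, bx and by and the values are made up for illustration, and it assumes a 32-bit image 800 pixels wide (4 bytes per pixel), like the test screen above:

```qb64
DEFLNG A-Z
w = 800 ' image width in pixels (assumed, matches the 800x600 test screen)
px = 10: py = 3 ' a pixel coordinate, hypothetical values

' Old way: work in pixels and convert to bytes for every single pixel.
oldOff = (w * py + px) * 4

' New way: convert each axis to a byte offset once, then just add them.
bx = px * 4 ' one pixel to the right = 4 bytes
by = py * w * 4 ' one row down = w * 4 bytes
newOff = by + bx

PRINT oldOff, newOff ' both are the same byte offset
```

Once x and y are byte offsets, stepping one pixel right is just + 4 and stepping one row down is + w * 4, so there are no multiplies left inside the drawing loop.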

The second change I also mentioned above: I replaced the floating-point division (/) with integer division (\). This brought the time down further, from 2.6 to 1.68 seconds.
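For anyone who hasn't run into the difference between the two operators, a tiny sketch (the values are only for illustration):

```qb64
DEFLNG A-Z
dX = 1600: dY = 400 ' hypothetical byte deltas

PRINT 7 / 2 ' regular division: floating-point result, 3.5
PRINT 7 \ 2 ' integer division: the fraction is dropped, 3

' With LONG variables the value has to end up an integer anyway,
' so \ skips the trip through floating point.
inc = dX \ dY
PRINT inc
```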

For the final change, I noticed that the result of the integer division never actually changes INSIDE the loop. dX and dY don't change there, so dY \ dX (or dX \ dY) is always going to produce the same value. I now calculate the increment ONCE before each loop, with inc = dY \ dX or inc = dX \ dY, and use that value inside the loop itself. Doing math ONCE is faster than doing it multiple times. This cut the time from 1.68 seconds to 1.47.
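A stripped-down sketch of that idea, outside the drawing routine. The "before" loop is my assumption about what recomputing the division on every pass looks like, and yBefore, yAfter and the values are made up:

```qb64
DEFLNG A-Z
dX = 400: dY = 1600 ' hypothetical deltas; neither changes inside the loops

' Before: the same division is repeated on every pass.
yBefore = 0
FOR x = 4 TO dX STEP 4
    yBefore = yBefore + dY \ dX
NEXT x

' After: divide once, then reuse the result inside the loop.
inc = dY \ dX
yAfter = 0
FOR x = 4 TO dX STEP 4
    yAfter = yAfter + inc
NEXT x

PRINT yBefore, yAfter ' same result either way, with fewer divisions in the second loop
```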

And when you figure we started with a process that took over 10 seconds, optimizing it down to 1.47 is quite a boost in overall performance! It still doesn't compare to the speed we see from CircleFill, which I plugged in for testing and which took 0.32 seconds to do the same thing, but it's a heckuva change from what it was originally. ;)