My take on a new LINE INPUT system
07-12-2014, 04:39 PM
Post: #1
See if this sounds good:

SUB _TEST_LINE INPUT (filehandle, variable$, line-ending$)

Since it's a line input, we only need the single variable$ parameter, correct?   We can't LINE INPUT #1, foo$, foo2$, foo3$, can we?

The way I see it, we'll open the file with filehandle and then read a disk sector at a time.   

On Windows we can use the GetDiskFreeSpace function to find the exact sector size, so we can set our read buffer to 512 or 4096 bytes, as best supported by the user's hardware.  The larger the buffer, the less likely we'll need multiple reads from the disk.  Does Linux have a similar command?  If not, it could default to 512 bytes and still be a TON faster than the "read a byte at a time" approach that QB64 is using.
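On the Linux question, one option (a sketch, not something taken from the QB64 source) is the POSIX `stat()` call, whose `st_blksize` field reports the filesystem's preferred I/O block size. That is not necessarily the physical sector size, but it is a reasonable read-buffer size. The helper name `preferred_buffer_size` and the 512-byte fallback are assumptions made for illustration.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns the filesystem's preferred I/O block size
   for `path`, or `fallback` when stat() fails or reports nothing useful.
   POSIX only (Linux, macOS); Windows would need a different call. */
long preferred_buffer_size(const char *path, long fallback)
{
    struct stat st;

    if (stat(path, &st) != 0 || st.st_blksize <= 0)
        return fallback;
    return (long)st.st_blksize;
}
```

Calling `preferred_buffer_size(".", 512)` would give a buffer size for the current directory's filesystem and fall back to 512 if the query fails.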

Once we get the string, we just parse it for the EOL marker.  CHR$(10), CHR$(13) should be searched for by default, but I don't see why we couldn't allow the user to pass an optional parameter to define their own.  (Like, say, CHR$(0) if one wishes.)

If we don't find that EOL marker, we just grab another sector and append it to the first string and repeat the process until either we find what we're looking for, or else hit the EOF marker.  Once we find it, we just set the file position pointer to the first byte after our EOL marker so we start the next read at the proper position.
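The read-a-chunk, search, append, reposition loop described above could look roughly like this in C. This is only a sketch under assumptions: the function name `read_line_chunked`, the fixed 512-byte `CHUNK_SIZE`, and the use of plain stdio on a file opened in binary mode are all illustrative choices, not the QB64 runtime's actual code. The search starts a few bytes before each new chunk so that a multi-byte marker such as CR+LF is still found when it straddles two reads.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 512 /* assumed read size; ideally the disk's block size */

/* Reads from the current position of `fp` up to (not including) the next
   `eol` marker, CHUNK_SIZE bytes at a time, and leaves the file position
   just past the marker. Returns a malloc'd NUL-terminated string that the
   caller must free(), or NULL when nothing is left to read or on error. */
char *read_line_chunked(FILE *fp, const char *eol)
{
    size_t eol_len = strlen(eol);
    long start = ftell(fp); /* file offset where this line begins */
    size_t len = 0, cap = 0;
    char *line = NULL;

    if (eol_len == 0 || start < 0)
        return NULL;

    for (;;) {
        /* Make room for one more chunk plus the terminating NUL. */
        if (cap - len < CHUNK_SIZE + 1) {
            size_t new_cap = cap ? cap * 2 : CHUNK_SIZE + 1;
            char *grown = realloc(line, new_cap);
            if (grown == NULL) { free(line); return NULL; }
            line = grown;
            cap = new_cap;
        }

        size_t got = fread(line + len, 1, CHUNK_SIZE, fp);
        if (got == 0)
            break; /* end of file (or read error) */

        /* Re-check the last eol_len - 1 bytes of the previous chunk too,
           so a marker split across two reads is not missed. */
        size_t from = (len >= eol_len - 1) ? len - (eol_len - 1) : 0;
        len += got;

        for (size_t i = from; i + eol_len <= len; i++) {
            if (memcmp(line + i, eol, eol_len) == 0) {
                line[i] = '\0';
                /* Next read starts right after the marker. */
                fseek(fp, start + (long)(i + eol_len), SEEK_SET);
                return line;
            }
        }
    }

    if (len == 0) { free(line); return NULL; }
    line[len] = '\0'; /* last line had no trailing marker */
    return line;
}
```

A caller would open the file with `fopen(path, "rb")`, call `read_line_chunked(fp, "\r\n")` (or `"\n"`) in a loop until it returns NULL, and `free()` each returned line. Note the trade-off in this design: every call reads at least one full chunk and then seeks back, so it pays off most on long lines and least on files made of very short ones.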

For QB64 users, it'd work just as they've always had it work (but with that additional optional parameter), but the end result would be tons faster.  TONS and TONS faster with long strings: the longer the string, the greater the savings.

What do you guys think?   Anything I'm missing that we should add into the process there?  Something to add flexibility?  Or do you have a better underlying procedure to use?

I'll work up a proof of concept in BASIC real soon (I've done this 1000 times in the past and can get a BAS demo out with about 15 minutes of typing), and you guys can see the difference in speed.  If writing the routine in BAS source is 40-50 times faster than what we have already, how much further can we increase the performance by having it in pure C?
07-13-2014, 03:17 AM
Post: #2
Here's a proof of concept routine for use in QB64:

Code Snippet:
file$ = "temp.txt"

$CHECKING:OFF


OPEN file$ FOR OUTPUT AS #1
FOR i = 1 TO 100000
    test$ = "This is just a big ole pile of Junk Data for Testing #" + STR$(i)
    PRINT #1, test$
NEXT
CLOSE

'Time the built-in LINE INPUT # routine on our data
t# = TIMER(0.001)
OPEN file$ FOR INPUT AS #1
DO
    LINE INPUT #1, junk$
    '    PRINT junk$     'Unremark this line to see that we're getting the same info
LOOP UNTIL EOF(1)
CLOSE
t2# = TIMER(0.001)


OPEN file$ FOR BINARY AS #1
DO
    GetInput 1, junk$, ""
    '    PRINT junk$     'Unremark this line to see that we're getting the same info
LOOP UNTIL EOF(1)
CLOSE
t3# = TIMER(0.001)


result$ = "###.###### seconds with & method"
r# = t2# - t#
PRINT USING result$; r#; "INPUT #"
r# = t3# - t2#
PRINT USING result$; r#; "GETINPUT #"



KILL file$ 'clean up afterwards


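SUB GetInput (filehandle, text AS STRING, CRLF AS STRING)
    'Reads one line from a file opened FOR BINARY, one buffer at a time.
    'Pass "" as CRLF to use the default line ending for the current OS.

    FileBuffer = 512 '        Set this number to match the sector size of your hard drive.
    '                               Then we can read as much data as possible each spin of the disk head,
    '                               and not need extra reads in the future.
    IF CRLF = "" THEN
        IF INSTR(UCASE$(_OS$), "WIN") THEN CRLF = CHR$(13) + CHR$(10) ELSE CRLF = CHR$(10)
    END IF
    StartByte = SEEK(filehandle) 'current read position in the file
    l = LOF(filehandle)
    limit = l - StartByte
    IF FileBuffer > limit THEN FileBuffer = limit 'shrink the buffer near the end of the file
    temp$ = SPACE$(FileBuffer): text = ""
    DO
        GET #filehandle, StartByte, temp$ 'read one buffer's worth
        x = INSTR(temp$, CRLF) 'only finds a marker that lies fully inside this buffer
        IF x = 0 THEN
            'No line ending yet: keep the whole buffer and read the next one
            StartByte = StartByte + FileBuffer
            text = text + temp$
        ELSE
            'Found it: keep everything before the marker and stop
            StartByte = StartByte + x - 1
            text = text + LEFT$(temp$, x - 1)
            EXIT DO
        END IF
    LOOP UNTIL EOF(filehandle)

    'Move the file pointer past the line ending, ready for the next call
    SEEK filehandle, StartByte + LEN(CRLF)
END SUB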

Notice this behaves EXACTLY like LINE INPUT does, but for Windows users, this routine is about 20 times faster than LINE INPUT.   Even for Linux, it seems about 3 times faster (and Linux works a lot better than Windows in this regard).  

I think all-in-all, it goes to show just how weak the built-in QB64 command is, and how much it needs to be improved...   Now to just try and get a C version of it up and going.
07-09-2017, 09:11 AM (This post was last modified: 07-09-2017 09:14 AM by smokingwheels.)
Post: #3
That's a good improvement.
I have given up processing long strings; some of them are around or over 1 million characters in length.

There was a backwards compatibility issue among the developers, and something like your example was never implemented. It's on the qb64.net forum, which is down ATM.

Here is a way to just open the whole file as a string.  I used it on my web server and the site load time went from 2 minutes to ~3 seconds AU to US.

I tried to put it in a code window, but I can't seem to get it to work; the blue box disappears.

loaddata:
file$ = "/home/john/Downloads/qb64/runme1.txt"
OPEN file$ FOR BINARY AS #1
m$ = SPACE$(LOF(1))
GET #1, , m$
CLOSE #1
PRINT "Data Loaded "
RETURN
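For comparison, the same "size the buffer from the file length, then fill it with one read" idea can be sketched in C. The function name `load_whole_file` is an assumption for illustration, and the sketch relies on `fseek`/`ftell` reporting the size of a file opened in binary mode, which holds for ordinary disk files.

```c
#include <stdio.h>
#include <stdlib.h>

/* Loads the entire file at `path` into a malloc'd, NUL-terminated buffer.
   On success returns the buffer (caller frees it) and, if `size_out` is
   not NULL, stores the byte count there. Returns NULL on any failure. */
char *load_whole_file(const char *path, size_t *size_out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Find the file length, like LOF(1) in the BASIC version. */
    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return NULL; }
    long size = ftell(fp);
    if (size < 0) { fclose(fp); return NULL; }
    rewind(fp);

    /* One allocation and one read, like SPACE$(LOF(1)) followed by GET. */
    char *data = malloc((size_t)size + 1);
    if (data == NULL) { fclose(fp); return NULL; }

    size_t got = fread(data, 1, (size_t)size, fp);
    fclose(fp);
    if (got != (size_t)size) { free(data); return NULL; }

    data[got] = '\0';
    if (size_out != NULL)
        *size_out = got;
    return data;
}
```

As with the BASIC routine, this trades memory for speed: the whole file has to fit in RAM at once.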
07-09-2017, 10:04 AM (This post was last modified: 07-09-2017 10:06 AM by smokingwheels.)
Post: #4
I tried a file 1 million lines long with 1 char per line, and GetInput is slightly slower.
However, GetInput works like lightning for 1 million chars on one line.
I run Linux.

Line input = 181.912 Seconds
Get input  =   0.387 Seconds

Different data "0123456789":
Line input = 192.290 Seconds
Get input  =   0.397 Seconds

So, to put it another way, it could save me over 440 computing days with my old PC.