
Author Topic: Grab Range in Open File  (Read 141 times)

pitt

  • Full Member
  • ***
  • Posts: 243
  • Where's Timmy?
    • Email
Grab Range in Open File
« on: October 25, 2012, 12:50:34 AM »
I am currently storing my MAP files in a single PKG file that I create by reading each file using BINARY. To compile the PKG I read each file's size, write a header, then write the data. It's pretty slow, as an average map contains text data and images, so each PKG is around 2-6MB. Extracting the files is also slow.

I think if there were a way to grab the entire contents of a file in the PKG into an array in one quick grab, then I could simply put the entire array back as a file. So if I could do something like ...

Code: [Select]
GET #1, a to b, file&&
PUT #2, , file&&

That would rock.

Another thought was to load the entire file into RAM then grab the file data from there and write it to disk in one swap, but I don't see how I would go about that.

Here is the code I have now (so far). It changes to a sub-dir named "test" and compiles the files named "map" + map$ + "*.*".

So:

test\map01001.map
test\map01001.crd
test\map01001.spr
test\map01001a.png
test\map01001b.png
test\map01001c.png
test\map01001d.png

Then it will extract the files to \extracted

Code: [Select]
SCREEN 0
_SCREENMOVE _MIDDLE
DIM f$(500), f&(5000)
DIM SHARED splitstring$(5000) ' SplitString Array

PRINT "MAP Compiler v0.1"

INPUT "MAP#", map$

PRINT
PRINT "Scanning Directory"
PRINT
CHDIR "test"
SHELL _HIDE "del map" + map$ + ".pkg"
SHELL _HIDE "dir map" + map$ + "*.* /o:n /b > temp.dat"
OPEN "temp.dat" FOR INPUT AS #1
IF EOF(1) = -1 THEN PRINT "MAP NOT FOUND": END

b = 0
DO
    b = b + 1
    INPUT #1, f$(b)
    IF RIGHT$(f$(b), 4) = ".psd" THEN f$(b) = "": b = b - 1 ' Excluded Files
    IF RIGHT$(f$(b), 9) = "_undo.crd" THEN f$(b) = "": b = b - 1 ' Excluded Files
LOOP UNTIL EOF(1) = -1
CLOSE

PRINT "FOUND"; b; "Files"
PRINT

SLEEP 1

b = 0
DO
    b = b + 1
    IF f$(b) = "" THEN EXIT DO
    OPEN f$(b) FOR INPUT AS #1: f&(b) = LOF(1): CLOSE #1
    PRINT f$(b); " -"; f&(b); "bytes"
    header$ = header$ + f$(b) + "," + RIGHT$(STR$((f&(b))), LEN(STR$(f&(b))) - 1) + ","
LOOP

header$ = header$ + "!" ' Terminator

PRINT
PRINT "HEADER: LENGTH - " + STR$(LEN(header$))
PRINT
PRINT header$
PRINT ""
PRINT "WRITING HEADER ..."

OPEN "map" + map$ + ".pkg" FOR BINARY AS #1
a = 0
DO
    a = a + 1
    z$ = RIGHT$(LEFT$(header$, a), 1)
    PUT 1, a, z$
LOOP UNTIL a = LEN(header$)

b = 0
DO
    b = b + 1
    IF f$(b) = "" THEN EXIT DO
    PRINT "WRITING " + f$(b)
    OPEN f$(b) FOR BINARY AS #2
    c = 0
    DO
        c = c + 1
        a = a + 1
        GET #2, c, z$
        PUT 1, a, z$
    LOOP UNTIL c = f&(b)
    CLOSE #2
LOOP

CLOSE #1

PRINT "TOTAL BYTES"; a

SLEEP 1

PRINT
PRINT "Extracting Files ...."
PRINT
PRINT "Reading Header ...."
PRINT

OPEN "map" + map$ + ".pkg" FOR BINARY AS #1
a = 0
DO
    a = a + 1
    SEEK 1, a:
    GET 1, a, z%
    x$ = _MK$(INTEGER, z%)
    nheader$ = nheader$ + LEFT$(x$, 1)
LOOP UNTIL LEFT$(x$, 1) = "!"

PRINT "HEADER: LENGTH - " + STR$(LEN(nheader$))
PRINT
PRINT "HEADER: "; nheader$
PRINT

SHELL _HIDE "del extracted\*.* /y"

prev = a ' Previous Position Value

CALL split_string(nheader$, ",")

b = -1: g = 0
DO
    b = b + 2
    IF splitstring$(b) = "!" OR splitstring$(b) = "" THEN EXIT DO
    f$(b) = splitstring$(b): PRINT "b: "; b; splitstring$(b)
    f&(b) = VAL(splitstring$(b + 1)): PRINT "b"; b; VAL(splitstring$(b + 1))

    OPEN "extracted\" + f$(b) FOR BINARY AS #2
    c = 0
    DO
        a = a + 1: c = c + 1
        SEEK 1, a: GET 1, a, z%
        x$ = _MK$(INTEGER, z%)
        t$ = LEFT$(x$, 1)
        PUT 2, c, t$
    LOOP UNTIL c = f&(b)
    prev = a
    CLOSE #2

LOOP

END

'--------------------------------------------------------------------------------

SUB split_string (str1ng$, seperator$)
ERASE splitstring$
z$ = str1ng$: w$ = seperator$
a = 0: b = 0: x = 0: y$ = "": z = 0
IF w$ = "" THEN w$ = ";"
IF z$ = "" THEN
ELSE
    IF LEN(z$) = 0 OR LEN(z$) = 1 THEN
    ELSE
        z$ = w$ + z$ + w$
        DO
            a = a + 1
            x$ = RIGHT$(LEFT$(z$, a), 1)
            IF x$ = w$ THEN b = 1: y$ = ""
            WHILE b = 1
                a = a + 1
                x$ = RIGHT$(LEFT$(z$, a), 1)
                IF x$ = w$ THEN
                    a = a - 1
                    x = x + 1 ' Count Up in the Array
                    splitstring$(x) = y$ ' Add Value to the Array
                    b = 0
                ELSE
                    y$ = y$ + x$
                END IF
                IF a = LEN(z$) THEN EXIT WHILE
            WEND
        LOOP UNTIL z = 1 OR a = LEN(z$)
    END IF
END IF
END SUB

Where's Timmy?

SMcNeill

  • Hero Member
  • *****
  • Posts: 2436
    • Email
Re: Grab Range in Open File
« Reply #1 on: October 25, 2012, 03:50:01 AM »
Quote from: pitt on October 25, 2012, 12:50:34 AM
I am currently storing my MAP files in a single PKG file that I create by reading each file using BINARY. To compile the PKG I read each file's size, write a header, then write the data. It's pretty slow, as an average map contains text data and images, so each PKG is around 2-6MB. Extracting the files is also slow.

I think if there were a way to grab the entire contents of a file in the PKG into an array in one quick grab, then I could simply put the entire array back as a file. So if I could do something like ...

Code: [Select]
GET #1, a to b, file&&
PUT #2, , file&&

I think what you're looking for is something simple like this:
Code: [Select]
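DIM temp AS STRING
temp = SPACE$(b - a + 1) ' Buffer sized to the byte range a..b
GET #1, a, temp ' Read the whole range in one call
PUT #2, , temp ' Write it out in one call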
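
As a self-contained sketch of the same idea (the file name demo.dat and its contents are made up purely for illustration), this writes a ten-byte file and then grabs bytes a through b with a single GET:

Code: [Select]
' Illustration only: demo.dat and its contents are made up.
OPEN "demo.dat" FOR OUTPUT AS #1: CLOSE #1 ' Create/truncate the file
OPEN "demo.dat" FOR BINARY AS #1
full$ = "ABCDEFGHIJ"
PUT #1, 1, full$ ' Bytes 1 to 10

a = 3: b = 6 ' First and last byte of the range (1-based)
DIM temp AS STRING
temp = SPACE$(b - a + 1) ' Size the buffer to the range length
GET #1, a, temp ' One read fills the whole buffer
CLOSE #1

PRINT temp ' Bytes 3 to 6, i.e. CDEF

The length of the string decides how many bytes GET pulls in, so "a to b" becomes "start at a, with a buffer of b - a + 1 bytes".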


Question: 

Why is it DIM f$(500), f&(5000)?   f$ is the file name, and f& is the length of the file.   Why 5000 file lengths for 500 files?
« Last Edit: October 25, 2012, 04:07:03 AM by SMcNeill »
http://bit.ly/TextImage -- Library of QB64 code to manipulate text and images, as a BM library.
http://bit.ly/Color32 -- A set of color CONST for use in 32 bit mode, as a BI library.

http://bit.ly/DataToDrive - A set of routines to quickly and easily get data to and from the disk.  BI and BM files

SMcNeill

Re: Grab Range in Open File
« Reply #2 on: October 25, 2012, 04:30:50 AM »
A few things you might consider:

Instead of making one long header and stacking the file name and size in it all together, make the header into a separate index file.

You have:
Code: [Select]
b = 0
DO
    b = b + 1
    IF f$(b) = "" THEN EXIT DO
    OPEN f$(b) FOR INPUT AS #1: f&(b) = LOF(1): CLOSE #1
    PRINT f$(b); " -"; f&(b); "bytes"
    header$ = header$ + f$(b) + "," + RIGHT$(STR$((f&(b))), LEN(STR$(f&(b))) - 1) + ","
LOOP

header$ = header$ + "!" ' Terminator

PRINT
PRINT "HEADER: LENGTH - " + STR$(LEN(header$))
PRINT
PRINT header$
PRINT ""
PRINT "WRITING HEADER ..."

OPEN "map" + map$ + ".pkg" FOR BINARY AS #1
a = 0
DO
    a = a + 1
    z$ = RIGHT$(LEFT$(header$, a), 1)
    PUT 1, a, z$
LOOP UNTIL a = LEN(header$)

Would something like this work better for you?
Code: [Select]
TYPE fileinfo
    name AS STRING * 255
    length AS LONG
END TYPE

DIM file AS fileinfo


b = 0
OPEN "map" + map$ + ".idx" FOR BINARY AS #2
DO
    b = b + 1
    IF f$(b) = "" THEN EXIT DO
    OPEN f$(b) FOR INPUT AS #1: f&(b) = LOF(1): CLOSE #1
    PRINT f$(b); " -"; f&(b); "bytes"
    file.name = f$(b)
    file.length = f&(b)
    PUT #2, , file
LOOP
CLOSE #2

Notice that here I combine the name and length into one custom type for quick reading and writing.  To get your file data back, open the index for binary as #2 and then:
GET #2, filenumber * 259 + 1, file   (Each record is 255 + 4 = 259 bytes, which is also what LEN(file) returns, and this assumes we start at filenumber 0.)    So if you need the 10th file, set filenumber to 9 and get it.  One quick call and you have the name and length all at once, and you don't need to break down the header to get what you want.  ;)
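
A minimal sketch of that round trip, assuming a made-up index file demo.idx with two made-up entries. The member is called fname here only to steer clear of the NAME keyword (my choice, not a requirement of the type above), and LEN(file) is used so the record size never has to be hard-coded:

Code: [Select]
' Illustration only: demo.idx and both entries are made up.
TYPE fileinfo
    fname AS STRING * 255
    length AS LONG
END TYPE

DIM file AS fileinfo
PRINT "Record size:"; LEN(file) ' 255 + 4 = 259 bytes

OPEN "demo.idx" FOR OUTPUT AS #2: CLOSE #2 ' Start with an empty file
OPEN "demo.idx" FOR BINARY AS #2
file.fname = "map01001.map": file.length = 1234
PUT #2, , file ' Record 0
file.fname = "map01001.crd": file.length = 5678
PUT #2, , file ' Record 1

filenumber = 1 ' Zero-based record number
GET #2, filenumber * LEN(file) + 1, file
CLOSE #2

PRINT RTRIM$(file.fname); file.length ' Shows the second entry

RTRIM$ strips the padding spaces that a fixed-length string carries.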

*************

If you do stick with the method you're using, why write only 1 byte at a time??

Code: [Select]
DO
    a = a + 1
    z$ = RIGHT$(LEFT$(header$, a), 1)
    PUT 1, a, z$
LOOP UNTIL a = LEN(header$)

Why not PUT #1, , header$ and be done with it?

And here:

Code: [Select]
b = 0
DO
    b = b + 1
    IF f$(b) = "" THEN EXIT DO
    PRINT "WRITING " + f$(b)
    OPEN f$(b) FOR BINARY AS #2
    c = 0
    DO
        c = c + 1
        a = a + 1
        GET #2, c, z$
        PUT 1, a, z$
    LOOP UNTIL c = f&(b)
    CLOSE #2
LOOP

Try it with one run as well:
Code: [Select]
b = 0
DO
    b = b + 1
    IF f$(b) = "" THEN EXIT DO
    PRINT "WRITING " + f$(b)
    OPEN f$(b) FOR BINARY AS #2
    z$ = SPACE$(f&(b))
    GET #2, 1, z$
    PUT #1, , z$
    CLOSE #2
LOOP

One of the biggest advantages of binary is reading and writing data in chunks instead of in bytes.  I think you'll be pleasantly surprised at how much faster it can get.   :)
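
If a file were ever too big to hold comfortably in one string, a variation on the same idea is to copy it in fixed-size chunks. This is only a sketch: the file names, the 200,000-byte test file and the 64 KB chunk size are arbitrary choices for illustration, not anything from the code above.

Code: [Select]
' Illustration only: file names and chunk size are arbitrary.
CONST CHUNK = 65536

' Build a 200,000-byte source file to copy
OPEN "demo_src.dat" FOR OUTPUT AS #1: CLOSE #1
OPEN "demo_src.dat" FOR BINARY AS #1
fill$ = STRING$(200000, "x")
PUT #1, 1, fill$

OPEN "demo_dst.dat" FOR OUTPUT AS #2: CLOSE #2
OPEN "demo_dst.dat" FOR BINARY AS #2

remaining& = LOF(1)
SEEK #1, 1 ' Rewind the source after writing it
DO WHILE remaining& > 0
    IF remaining& < CHUNK THEN n& = remaining& ELSE n& = CHUNK
    buf$ = SPACE$(n&) ' Buffer length = bytes to read this pass
    GET #1, , buf$ ' Read the next chunk
    PUT #2, , buf$ ' Append it to the destination
    remaining& = remaining& - n&
LOOP

PRINT "Copied"; LOF(2); "bytes"
CLOSE #1, #2

For 2-6MB packages the single GET/PUT is simpler and should be fine; the loop only matters once the files get much larger.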

pitt

Re: Grab Range in Open File
« Reply #3 on: October 25, 2012, 05:51:10 AM »
Quote from: SMcNeill on October 25, 2012, 03:50:01 AM

Question: 

Why is it DIM f$(500), f&(5000)?   f$ is the file name, and f& is the length of the file.   Why 5000 file lengths for 500 files?

Well, I apparently do not understand what the DIM amounts do or how they work; it's something I had been meaning to research.

Say I have an array called max_hp() for 20 characters. If I simply do "DIM max_hp()" and then try to fill 20 (max entries) "max_hp(1) through max_hp(20)" I get an error. So I then do "DIM max_hp(20)", but the HP MAX is 9,999,999 ... So when each of the 20 array entries gets larger I get another error (I thought 20 would mean 20 entries?). So then I increased the "DIM max_hp(20)" to "DIM max_hp(500)" and then I can assign all 20 values to 9,999,999 with no problem. With the staggering number of variables I use, I could not find the time to reduce the 500, while each of the 20 array entries is set to 9999999, to discover the lowest value that works. I thought that simply DIM'ing the variable at 20 would cover the 20 entries, but it doesn't. It seems to be based on the total size of all of the array entries. But if that were the case then I would have to DIM the array at 20*9999999, yet 500 works, and 20 or even 100 doesn't. Needless to say I'm confused, so as a temporary solution I DIM'ed all my variables at 500 and the ones that will store massive amounts of data at 5000 (including single variables and text arrays/variables). The same thing happens if I do a DIM names$(20): I can't fill each of the 20 entries with the names before QB64 gives me an error, and again if I increase it to 500 then I have no issues.

That's why it makes no sense: I don't really understand how to calculate how much to DIM an array or variable.

pitt

Re: Grab Range in Open File
« Reply #4 on: October 25, 2012, 06:13:29 AM »
Quote from: SMcNeill on October 25, 2012, 04:30:50 AM
Try it with one run as well:
Code: [Select]
b = 0
DO
    b = b + 1
    IF f$(b) = "" THEN EXIT DO
    PRINT "WRITING " + f$(b)
    OPEN f$(b) FOR BINARY AS #2
    z$ = SPACE$(f&(b))
    GET #2, 1, z$
    PUT #1, , z$
    CLOSE #2
LOOP

One of the biggest advantages of binary is reading and writing data in chunks instead of in bytes.  I think you'll be pleasantly surprised at how much faster it can get.   :)

Thanks, but I do not need to extract single files from the PKG. Maps are extracted, loaded, used, deleted. The write speed is almost instant now when building the PKG, but I can't seem to duplicate that for the extraction process. Could you post how to extract the included files in chunks using the code I posted? Thanks!

SMcNeill

Re: Grab Range in Open File
« Reply #5 on: October 25, 2012, 06:14:23 AM »
Dim the array for the number of elements in the array AS the variable type needed for those elements.

Code: [Select]
DIM max_hp(20)

FOR i = 1 TO 20
    max_hp(i) = 9999999
NEXT

FOR i = 1 TO 20
    PRINT "Character #"; i, "Max HP:"; max_hp(i)
NEXT

This works because we're defining our array to hold elements of SINGLE (the default) type, with indices 0 through 20, so 1 to 20 fits.  If we tried to define it as _BYTE or INTEGER, it wouldn't work, as they won't hold a number that size.  We could define it as LONG if we want though, and keep it an integer.

Code: [Select]
DIM max_hp(20) AS LONG

FOR i = 1 TO 20
    max_hp(i) = 9999999
NEXT

FOR i = 1 TO 20
    PRINT "Character #"; i, "Max HP:"; max_hp(i)
NEXT

Same way with names.

DIM charactername(20) AS STRING * 100

The above would let us have an array with 20 names, and each name could be up to 100 letters in length.
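
To see the two things DIM controls side by side, here is a small sketch (assuming no OPTION BASE 1 is in effect; the name "Timmy" is just sample data). The number in parentheses sets the highest index, while the AS type sets how big each value can be:

Code: [Select]
DIM max_hp(20) AS LONG ' Indices 0 to 20; LONG holds up to 2,147,483,647
DIM charactername(20) AS STRING * 100 ' Each entry is exactly 100 bytes

PRINT "Lowest index:"; LBOUND(max_hp); " Highest index:"; UBOUND(max_hp)

max_hp(20) = 9999999 ' Value size is limited by the type, not the DIM number
charactername(20) = "Timmy"
PRINT max_hp(20); RTRIM$(charactername(20)); LEN(charactername(20))

' max_hp(21) = 1 would fail here with "Subscript out of range"

So DIM max_hp(20) is enough for 20 characters no matter how large each HP value gets, as long as the type can hold it.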

SMcNeill

Re: Grab Range in Open File
« Reply #6 on: October 25, 2012, 06:17:26 AM »
Quote from: pitt on October 25, 2012, 06:13:29 AM
Quote from: SMcNeill on October 25, 2012, 04:30:50 AM
Try it with one run as well:
Code: [Select]
b = 0
DO
    b = b + 1
    IF f$(b) = "" THEN EXIT DO
    PRINT "WRITING " + f$(b)
    OPEN f$(b) FOR BINARY AS #2
    z$ = SPACE$(f&(b))
    GET #2, 1, z$
    PUT #1, , z$
    CLOSE #2
LOOP

One of the biggest advantages of binary is reading and writing data in chunks instead of in bytes.  I think you'll be pleasantly surprised at how much faster it can get.   :)

Thanks, but I do not need to extract single files from the PKG. Maps are extracted, loaded, used, deleted. The write speed is almost instant now when building the PKG, but I can't seem to duplicate that for the extraction process. Could you post how to extract the included files in chunks using the code I posted? Thanks!

What you have there would read and write it all at once.  :)

It sets z$ to hold a number of spaces equal to your file length.
It then gets all of z$ at once, and writes all of z$ at once.

To get a string with binary, we need to know the length of that string.  If you do, set your string to that length with blank spaces, and then just read the whole thing at once.
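
A quick sketch of that rule, using a made-up file demo.dat: the same GET reads 5 bytes or the whole file depending only on how long the string is beforehand.

Code: [Select]
' Illustration only: demo.dat and its contents are made up.
OPEN "demo.dat" FOR OUTPUT AS #1: CLOSE #1 ' Create/truncate the file
OPEN "demo.dat" FOR BINARY AS #1
t$ = "HELLO WORLD"
PUT #1, 1, t$

five$ = SPACE$(5)
GET #1, 1, five$ ' Reads exactly 5 bytes
PRINT "5-byte buffer got: "; five$

whole$ = SPACE$(LOF(1)) ' LOF gives the file length in bytes
GET #1, 1, whole$ ' Reads the entire file
PRINT "Whole file: "; whole$
CLOSE #1

When the length isn't stored anywhere, LOF is the easy way to size a buffer for an entire file.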

SMcNeill

Re: Grab Range in Open File
« Reply #7 on: October 25, 2012, 06:34:29 AM »
NM - that's the routine that builds the PKG.   For the extraction routine, I think you'd want it like this:
Code: [Select]
CALL split_string(nheader$, ",")

b = -1: g = 0
a = LEN(nheader$) + 1 ' File data starts right after the header's "!" terminator
DO
    b = b + 2
    IF splitstring$(b) = "!" OR splitstring$(b) = "" THEN EXIT DO
    f$(b) = splitstring$(b): PRINT "b: "; b; splitstring$(b)
    f&(b) = VAL(splitstring$(b + 1)): PRINT "b"; b; VAL(splitstring$(b + 1))
    OPEN "extracted\" + f$(b) FOR BINARY AS #2
    temp$ = SPACE$(f&(b))
    GET 1, a, temp$
    PUT 2, , temp$
    a = a + f&(b)
    CLOSE #2
LOOP

Note:  I guarantee nothing, as I don't have the original files to actually read and write, but I think this is what you're looking for.   Read the whole thing at once and write it, and skip the part where you read it all one byte at a time.  :)

Question:  Why read it as an integer, convert it to a string, and then write it as a character?

Code: [Select]
    DO
        a = a + 1: c = c + 1
        SEEK 1, a: GET 1, a, z%
        x$ = _MK$(INTEGER, z%)
        t$ = LEFT$(x$, 1)
        PUT 2, c, t$
    LOOP UNTIL c = f&(b)

Why not read it as a _BYTE and put it as a _BYTE without needing to convert?  I think it's the conversion back and forth that is really slowing you down as much as anything.

Make this simple change before you do all the rest and see how this affects performance:

Code: [Select]
    DO
        a = a + 1: c = c + 1
        SEEK 1, a: GET 1, a, z%%
        PUT 2, c, z%%
    LOOP UNTIL c = f&(b)

Read a byte, write a byte, and skip the conversion process.

*************

Sidenote:  You don't need to SEEK with Binary if you use the middle argument of GET to define where you're going to be getting data from.
        SEEK 1, a   ---   This tells us to prepare to get information from openfile 1, at position a.
        GET 1, a, z%%   --- This tells us to get z%% from file 1, starting at position a.

You could leave the SEEK out and it'd work just as well, OR you could change your GET 1, a, z%% to GET 1, , z%% to get from file 1 at the current spot which the seek defined earlier.  There's no real reason to do both, and if you're looking for speed optimization, I'd just go ahead and remove the SEEK.

One fewer command to execute is one less drag on the process overall.  :)
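
Here is a small sketch of how the position argument behaves, again with a made-up demo.dat. The SEEK function (as opposed to the SEEK statement) reports where the next read will happen:

Code: [Select]
' Illustration only: demo.dat and its contents are made up.
OPEN "demo.dat" FOR OUTPUT AS #1: CLOSE #1 ' Create/truncate the file
OPEN "demo.dat" FOR BINARY AS #1
s$ = "ABCDEF"
PUT #1, 1, s$

DIM z AS _UNSIGNED _BYTE
GET #1, 1, z ' Explicit position: read byte 1
PRINT CHR$(z); " next position:"; SEEK(1)
GET #1, , z ' No position: continues at byte 2
PRINT CHR$(z); " next position:"; SEEK(1)
GET #1, 5, z ' Jump straight to byte 5, no SEEK statement needed
PRINT CHR$(z)
CLOSE #1

This should show A, B and E in turn: each GET moves the file position along by itself, so sequential reads need neither SEEK nor an explicit position.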
« Last Edit: October 25, 2012, 07:14:08 AM by SMcNeill »

pitt

Re: Grab Range in Open File
« Reply #8 on: October 25, 2012, 07:03:58 AM »
Thanks, but I got it to work via the following:

Code: [Select]
b = -1: g = 0
DO
    b = b + 2
    IF splitstring$(b) = "!" OR splitstring$(b) = "" THEN EXIT DO
    f$(b) = splitstring$(b): PRINT "b: "; b; splitstring$(b)
    f&(b) = VAL(splitstring$(b + 1)): PRINT "b"; b; VAL(splitstring$(b + 1))

    OPEN "extracted\" + f$(b) FOR BINARY AS #2
    z$ = SPACE$(f&(b))
    GET #1, prev + 1, z$
    PUT #2, , z$
    CLOSE #2

    prev = prev + f&(b)

LOOP

SYSTEM

SMcNeill

Re: Grab Range in Open File
« Reply #9 on: October 25, 2012, 07:14:42 AM »
Glad it's working for you now.  :)

pitt

Re: Grab Range in Open File
« Reply #10 on: October 25, 2012, 07:21:15 AM »
Quote from: SMcNeill on October 25, 2012, 07:14:42 AM
Glad it's working for you now.  :)

Yeah, me too. Extraction that used to take around 45 seconds now takes 2! Thanks!