Friday, May 18, 2007

Secrets of ENVI_READ_COLS

I have been wondering why ENVI_READ_COLS reads things so fast ever since I started using the IDL function txt_readall. I figured out the reason this evening.

The method used by ENVI_READ_COLS is:
1) Get the number of lines in the ASCII data matrix;
2) Get the number of columns in the ASCII data matrix;
3) Read in the data with the "readf" procedure.

That's it.
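The three steps can be sketched in plain IDL roughly as follows. This is only an illustration, not ENVI's actual code: the function name read_ascii_matrix is made up, the built-in FILE_LINES stands in for the line-counting step, and the file is assumed to be purely numeric, whitespace-delimited, with the same number of columns on every line and no blank lines.

```idl
; Hypothetical helper, not part of ENVI: read a numeric ASCII matrix.
; Assumes every line holds the same number of whitespace-separated
; numbers and that the file has no header or blank lines.
function read_ascii_matrix, file
  compile_opt idl2

  ; 1) Number of lines (FILE_LINES is a built-in IDL function)
  n_lines = file_lines(file)
  if n_lines eq 0 then return, -1

  openr, lun, file, /get_lun

  ; 2) Number of columns, taken from the first line
  line = ''
  readf, lun, line
  n_cols = n_elements(strsplit(line, /extract))

  ; 3) Rewind and read the whole matrix with a single READF call
  point_lun, lun, 0
  data = dblarr(n_cols, n_lines)
  readf, lun, data

  free_lun, lun
  return, data
end
```

A call such as data = read_ascii_matrix('N.txt') would then return a double array with one row per line of the file.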

Thus, the key to the speed is how to get the number of lines quickly. This is the task of ENVI_NUM_ASCII_LINES. Its syntax is:
num_lines = ENVI_NUM_ASCII_LINES(file)

It seems that ENVI_NUM_ASCII_LINES uses byte mode to search for line feed and carriage return characters, reading 10 lines at a time. I still don't know a way to achieve this.
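One possible way to count lines in byte mode is sketched below. It is a guess at the general technique, not a reconstruction of ENVI_NUM_ASCII_LINES: the name count_lines_bytes and the chunk_size keyword are hypothetical, the file is read in fixed-size byte chunks rather than 10 lines at a time, and lines are assumed to end with a line feed (byte 10), which covers both Unix and Windows line endings.

```idl
; Hypothetical sketch: count lines by scanning raw bytes for line feeds.
; Assumes lines end in LF (10B); CR LF endings are counted correctly too.
function count_lines_bytes, file, chunk_size=chunk_size
  compile_opt idl2
  if n_elements(chunk_size) eq 0 then chunk_size = 1048576L  ; 1 MB

  openr, lun, file, /get_lun
  info = fstat(lun)
  remaining = info.size   ; bytes still to be read
  n = 0LL                 ; running line count
  last = 10B              ; last byte seen

  while remaining gt 0 do begin
    nread = remaining < chunk_size       ; size of this chunk
    buf = bytarr(nread)
    readu, lun, buf                      ; unformatted (binary) read
    n += total(buf eq 10B, /integer)     ; line feeds in this chunk
    last = buf[nread-1]
    remaining -= nread
  endwhile
  free_lun, lun

  ; Count a final line that has no trailing line feed
  if info.size gt 0 and last ne 10B then n += 1
  return, n
end
```

Because each chunk is read with a single READU and scanned with an array comparison, no per-line string is ever created, which is where a line-by-line READF loop spends most of its time.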

My goal is to write an IDL procedure that does these things.

2 comments:

Enod said...

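function txt_lines, file, num_cols=num_cols
  ; Default file name used when none is passed
  if n_params() lt 1 then begin
    file='N.txt'
  endif
  openr,fid,file,/get_lun
  i=0L   ; long counter, so files with more than 32767 lines do not overflow
  tmp=''
  ; Count lines by reading them one at a time
  while not eof(fid) do begin
    readf,fid,tmp
    i=i+1
  endwhile
  free_lun,fid   ; release the unit so repeated calls do not exhaust LUNs

  ; Column count is taken from the last line read
  num_cols=n_elements(strsplit(tmp,/extract))
  return,i
end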

Enod said...

pro data_readcols, file, data=data, headers=headers, skip=skip, column_skip=column_skip
  ; Debugging defaults used when no file name is passed
  if n_params() lt 1 then begin
    file='
    skip=1
    data=dblarr(649,649)
    data=strarr(650,649)
    column_skip=1
  endif

  ; Treat a missing skip keyword as "no header lines",
  ; so that num_lines-skip below is always defined
  if n_elements(skip) eq 0 then skip=0

  num_lines=txt_lines(file, num_cols=ns)

  openr, fid, file, /get_lun
  tmp=''
  ; Read the header lines into a string array
  if skip gt 0 then begin
    for i=0, skip-1 do begin
      readf,fid,tmp
      if i eq 0 then begin
        headers=tmp
      endif else begin
        headers=[headers,tmp]
      endelse
    endfor
  endif

  if n_elements(column_skip) ne 0 then begin
    ; Read the remaining lines as strings, then drop the leading columns
    tmpdata=strarr(num_lines-skip)
    readf, fid, tmpdata
    data=dblarr(ns-column_skip, num_lines-skip)
    for li=0, num_lines-skip-1 do begin
      tmp=strsplit(tmpdata[li],/extract)
      data[*,li]=double(tmp[column_skip:*])
    endfor

  endif else begin
    ; All columns are numeric: read the whole matrix at once
    data=dblarr(ns, num_lines-skip)
    readf, fid, data
  endelse

  free_lun, fid

  ;help, data, headers
end