Text Import Wizard could add space as a delimiter

RAmeeti · May 17, 2017, 6:11am

I have a text file that has its ‘fields’ demarated with spaces (not tabs, commas, etc.). The Import Wizard offered no assistance but suggested all text for Column 1. Rather than offer nothing, perhaps a last resort might be to separate on spaces. This would also allow for the combination of ‘fields’ when necessary for those fields that are multiple ‘words’.

admin · May 17, 2017, 7:24pm

The text import wizard doesn’t have explicit delimiters. It scans the imported text, if it finds tabs then it automatically sets the delimiter to tabs, otherwise it uses commas. There is no manual override.

If you do have a rare situation where you need to use some other separator character, I would recommend writing a short program using the importtext statement, which does allow the delimiter character to be set explicitly.

RAmeeti · May 17, 2017, 7:38pm

I fully understand that it uses Tabs if available, then it uses Commas. My point is that presently if it finds neither tabs, nor commas, then it does nothing. My request is rather than it doing nothing (which is unlikely that that would be the choice of a user), that it then lastly default to spaces. Doing nothing is rarely the right choice.

admin · May 17, 2017, 7:59pm

We’ll have to agree to disagree on this. I think that imported data with a single column of data is a much more likely scenario than data that is consistently separated by spaces. In other words, I think that in this case, “doing nothing” is more likely to be the right choice.

In the rare situation where you have a data file that is separated by spaces, you could write a one line program to import the text. Or, you could just import it and use the Split Field tool afterwards.

KJM · May 17, 2017, 8:05pm

Robert, is your text data file really separated with single space characters? This would be absolutely non-standard. But as Jim said, you could create your own importtext procedure with a formula looking for space characters as separators.

But I remember — years or decades ago — certain text files where the data were arranged by field length, and if the data length was 10 characters and the field length was 20 characters, then the data in the field was filled up with 10 space characters.

In those “field length separated” text files the only thing that helped parsing the data was knowing the structure (the defined field lengths) and using text funnels.

michael · May 17, 2017, 11:24pm

I strongly endorse Jim’s view. I often have occasion to import a large block of text into Panorama and then process it. Having every word in a separate field would be a disaster.

RAmeeti · May 18, 2017, 12:04am

Doing a File/Open will open your large block of text file without any problem. There is no need to use the Text Import Wizard if you do not want to do any manipulations on your text. Just open the file.

I do enough text import that I do deal with fixed length fields, character delimited fields, etc, but what I was attempting to convey is that a user would only be using the Text Import Wizard if they needed or wanted the Wizard to be doing something for them. If they did not want or need anything done, don’t use the Wizard. If the Wizard has been chosen to do some ‘wizardry’, then perhaps it should do something. I don’t typically go to the wizard when I want nothing done.

RAmeeti · May 18, 2017, 1:05am

What is non standard to one person, is very common to another. Would you consider Apache Server log files to be non-standard? There are quite a few of them out there and the file formats for these files were designed by geeks, not by our grandmothers. Parsing them on spaces would probably be the most efficient way to start. While it may not provide the perfect end result, the initial parsing on spaces could be all that is needed.

https://httpd.apache.org/docs/1.3/logs.html There is no tab, no comma, no fixed length. Just spaces.

This is certainly not the first time that I have had a space delimited file.

KJM · May 18, 2017, 6:11am

Robert, I made a little test database and set up a procedure for parsing the log. The database has a leftmost field “Log” for the import and then some fields according to the Apache documentation. Here is my procedure that distributes the log into the fields:

Field Host
Formulafill firstword(Log)
Field Client
Formulafill nthword(Log,2)
Field User
Formulafill nthword(Log,3)
Field Time
Formulafill nthword(Log,4)+" "+nthword(Log,5)
Field Request 
Formulafill Log['"',-1][2,-1][1,'"'][1,-2]
Field Status
Formulafill nthword(Log,-2)
Field Size
Formulafill nthword(Log,-1)

RAmeeti · May 18, 2017, 3:52pm

Thank you for your procedure.

Unfortunately the spec shared shows that the parameters are all optional and there is no mandated order. Each parameter may or may not appear in a log file and they may appear in any order. This particular spec was offered to show that space delimited files are not ‘absolutely non-standard’. A space delimited wizard would assist users who deal with these kinds of file formats. A simple space delimited import would give me (and many others) enough to accomplish our needs without needing the expertise of being able to write a procedure.

KJM · May 18, 2017, 5:10pm

My code applies to what the documentation refers to as “Common Log Format”.

If you simply want to import a text file with data separated by spaces, then use code like this:

importtext sometext,"Separator"," "

There are many more import options e.g. “REPLACE”, “REARRANGE”.