Topic: Problems making a CSV file

I need to build a CSV (comma separated values) file as one of the outputs from my fortran program.   Research on the web has indicated this is problematic in both formatted and format free output streams, and this has indeed turned out to be the case.   My data output line includes variable length strings, integers, and reals.  The problem is that it impossible to avoid generation of white space in the output.  I have found I can use, for instance, Notepad or OpenOffice.org spreadsheet to remove these spaces in a post-processing effort, and this solves the problem for the most part, although not as elegantly as I would prefer.  My issue is that I have one stubborn string, which still causes problems even after I have tried numerous variations in string length of the "A" edit descriptor.  When I look at the problematic string in the appropriate column in my spreadsheet program, I see that it looks like:

'2009-12-15

This string (a date representation) is preceded in the output stream by a character that displays like a single tic quote; however, I have not found a windows program which recognizes it as such in a search operation.   Notepad or other text processors do not display it at all.  Interestingly, all of the data fields that are generated in my csv output stream contain this character; the difference is that if there is white space following it, as in:

'   45.6709

then either notepad or openOffice.org spreadsheet will successfully remove both the white space and the odd character when I search (and replace with null) using a standard keyboard blank (ascii character 32).   The odd character is somehow associated/attached to the leading blank.  If there is not white space immediately following the odd character and before the beginning of the actual character string, then the search and replace is unsuccessful, and therein lies my problem. 

Does anyone have any thoughts about
1.  A way to suppress this character in the fortran output stream?
2.  What is the character; ie, its ascii or ansii code? Some sort of control character?
3.  Ways to replace it with a null?
4.  Other ideas?

thanks.....

Re: Problems making a CSV file

Write each line to a character string buffer.
Compact it.
Write the buffer to the output unit.

I have no idea where the "`" comes from.
---
John

Re: Problems making a CSV file

These two functions might come in handy.
The code is a bit 1977 but they do seem to work.
Sorry that I don't know how to display this in a monospaced font.


      CHARACTER*80 FUNCTION SKIPCHR(STRING,INDENT,NBGCHR)
C     ---------------------------------------------------
C     jw / 15-06-04  orig. draft commenced
C     jw / 21-11-04  last rev.

C     Scans STRING and returns it with all leading 'NBG' chars removed.
C     Returns the offset to the 1st 'good' character as INDENT
C     If all chars are NBG then returns SKIPCHR=' ' and INDENT=0.
C
      INCLUDE  'CFGTYP.INC'
      INCLUDE  'ASCTYP.INC'

      CHARACTER STRING*(*), NBGCHR*(*)
      INTEGER   INDENT,OKCHARS,NBG,NNBGCHR,LENGTH,LEN,YES

      DATA      YES / 1 /

      LENGTH  = LEN(STRING)
      NNBGCHR = LEN(NBGCHR)

      POSNBG  = 0
      OKCHARS = 0
      DO 10 INDENT = 1,LENGTH
         DO 10 NBG=1,NNBGCHR
10          IF (INDEX(NBGCHR(1:NNBGCHR),STRING(INDENT:INDENT)).EQ.0)
     +      GO TO 11
      SKIPCHR=' '
      INDENT =0
      RETURN

11    CONTINUE

      DO 12 I=1,LENGTH
12       SKIPCHR(I:I) = ' '
      SKIPCHR = STRING(INDENT:LENGTH)
     
      RETURN
      END




      CHARACTER*160 FUNCTION COMMAS(STRING)
C     -------------------------------------
C     jw / 21-11-04  orig. draft commenced
C     jw / 18-12-08  re-written
C     jw / 18-03-12  re-re-written
C     jw / 12-04-12  last rev.

C     Strips leading/training blanks.
C     Substitutes a single comma for each embedded string of a comma
C     and one or more blanks.
C     Strips trailing zeros directly in front of each comma
C------------------------------------------------------------------------/

      CHARACTER (LEN=*)    :: STRING
      LOGICAL SkipSPC
      INTEGER   LENGTH

      STRING = TRIM(STRING)

      LENGTH = LEN(STRING)
      COMMAS = TRIM(REPEAT(' ',160))


C     FIRST, SKIP LEADING BLANKS

      DO iPOS=1,LENGTH
         IF(STRING(iPOS:iPOS).NE.' ') EXIT
      END DO

C     NEXT, COPY STRING to COMMAS, SKIPPING BLANKS DIRECTLY AFTER COMMAS

      SkipSPC = .FALSE.
      iCOM=1

      DO iPOS=iPOS,LENGTH
         IF(SkipSPC .AND. STRING(iPOS:iPOS).EQ.' ') CYCLE
         ! Copy one character
         COMMAS(iCOM:iCOM)=STRING(iPOS:iPOS)
         ! Just copied a character, so don't skip any more spaces
         SkipSPC=.FALSE.
         iCOM=iCOM+1
         ! Keep doing so until a comma has been copied
         IF(STRING(iPOS:iPOS).NE.',')THEN
            CYCLE
         ELSE
            ! Just copied a comma, so add a space after it in COMMAS
            ! but skip spaces immed. after it in STRING
            SkipSPC = .TRUE.
            iCOM=iCOM+1
            COMMAS(iCOM:iCOM)=' '
            CYCLE
         END IF
         SkipSPC = .FALSE.
      END DO

      LENGTH = LEN_TRIM(COMMAS)

C     FINALLY, STRIP CONTIGUOUS TRAILING ZEROS IN REAL NUMBERS
C     IMMEDIATELY BEFORE COMMAS

      DO iPOS=LENGTH,1, -1
         ! Is this a comma?
         IF(COMMAS(iPOS:iPOS).EQ.',') THEN
            !Yes, it's a comma, so check the characters before it for
            !the nearest decimal point and the nearest comma. 
            !If a decimal is closer then number immediately before
            !Commas(iPOS:iPOS) has to be a real number, in which case,
            !strip trailing zeros
            NearestDecimal=INDEX(Commas(1:iPOS-1),'.',BACK=.TRUE.)
            NearestComma=INDEX(Commas(1:iPOS-1),',',BACK=.TRUE.)
            IF(NearestDecimal.GT.NearestComma) THEN
               iZER=iPOS
               LENGTH = LEN_TRIM(COMMAS)
               DO WHILE (COMMAS(iZER-2:iZER-1).EQ.'00')
                  iZER=iZER-1
               END DO
               IF(iZER.NE.iPOS)
     +         COMMAS(iZER:LENGTH)=
     +         COMMAS(iPOS:LENGTH)//repeat(' ',Ipos-Izer)
            END IF
         END IF
      END DO

      RETURN
      END

Re: Problems making a CSV file

John, thanks for the thoughts and the functions.  Since posting this yesterday, I have been trying various approaches, but I haven't found a way yet to make the control character go away on the fortran side.  Meanwhile, I have found that if I always make sure there is a leading space or two on any output string  then I can deal with the problem in the spreadsheet.  So, now I am not writing the original string to the output file, but rather a temp string which is a concatenation of two spaces and the string of interest.  That seems to resolve the problem of the spreadsheet program choking during the search and replace.

Going to hang with this work-around until I can determine what the character is and where it is coming from.....

thanks.....

Re: Problems making a CSV file

Could you possibly post your format string here?  Does the "`" character occur in columns other than just the first?  There is no reason for that character to be produced, so something is going wrong.

Jeff Armstrong
Approximatrix, LLC

Re: Problems making a CSV file

Jeff: 
Here are two examples of the problem.  The first is the metadata that I print to the top of each CSV file.
The second example is the printing of some actual data.




Example 1:
================================================
Character(Len=10) :: runDate

Character(Len=10) :: runTime
Character(len=90) :: projectPath
Real :: version
.
.
.
Write(18,*),  runDate

Write(18,*),  runTime

Write(18,*),  projectPath

Write(18,"(A11,F4.2)"), "Version: ", version



The output as seen in OpenOffice spreadsheet (I am replacing spaces with asterisks for clarity):


'*03-20-2013
'*15:46:39
*C:\Users\Dan\Documents\analyzer3\2013
**Version: 1.52


The odd character is only seen in the runDate and runTime output.
It is only displayed in the spreadsheet program, not in a text processor.

The odd character never shows in the path or version output in either the spreadsheet or the text processor, which
does suggest Fortran is variably transmitting the character, and it's not just a spreadsheet artifact.




Example 2:
========================================
Character(len=14) :: csvOutputDate
Integer :: d
Integer :: allValue
Integer :: startValue
Integer :: totalValue
Real :: percentTotalValueChge
Real :: percentOverallChge
Integer :: numberCurrent
.
.
.
csvOutputDate="    "//Trim(dateStringRev(d))
Write(18,*), d, ",", csvOutputDate, ",", allValue, ",", startValue, ",", totalValue, ",", &
             percentTotalValueChge, ",", percentOverallChge, ",", numberCurrent




1.) The output as seen in a text processor (again replacing each space with an asterisk. Also this is actually one line, not split into two as shown in Forum formatting):


**********30*,****2006-12-13,***********0*,******200000*,******200000*,***4.93873739*****,***5.197999995*****,***********4



2.) The output as seen in OpenOffice (spreadsheet column limits shown with a |. Also this is actually one line, not split into two as shown in Forum formatting):


'**********30*|'****2006-12-13|'***********0*|'******200000*|'******200000*|'***4.93873739*****|'***5.197999995*****|'***********4|





I have tried using formatted output in example 2....but the odd character appeared then as well, so I went back to
format-free to simplify.

If there is not a space between the odd character & the data, then search and replace in the spreadsheet is not successful.  If I cannot remove it, the end result is that the value in the spreadsheet is never recognized as a number and no calculations can be performed on that row & column.  This is why I have added 4 spaces to the front of the variable csvOutputDate.

In other words, the current work-around requires at least one leading space before each actual value.

I am using Version 3.4.1 of OpenOffice.

Thanks for looking at this....

Dan

Re: Problems making a CSV file

Hi -

My guess is that it has nothing to do with Fortran, etc. The "odd character" is coming entirely from the spreadsheet application. I am not a user of OpenOffice, but I use Excel, and I know that the single quote can be used at the start of text strings to force the entry to be text.

What is interesting is that, from your messages, the character is only added to the first two lines - runDate and runTime. However, note that those two lines, if not prefaced with the quote to denote them as text, would otherwise be interpreted as Formulas or some other format. For example, if I open Excel and type "1:2" (whithout the quotes), the cell shows "1:02" and if I click, the formula bar shows "1:02:00 AM". In other words, Excel understands "1:2" to be a Time entry. If I really want "1:2", then I must enter "'1:2" - that is, preface with a single quote. Likewise, if I enter "1-2-3", that gets read as a Date. To actually get 1-2-3, I must preface with a single quote. Your other two lines, projectPath and version, are unambiguous to the spreadsheet; they can only be understood as text, so a quote character does not get added.

So, I guess when you import the data into your spreadsheet, it adds the character (it is a single quote, the lower case found under the uppercase " on most keyboards) to "protect" your value as text - that is, protect your value from being understood to be a time, date, formula, etc.

Regards

Re: Problems making a CSV file

petesun -

I think your explanation is quite reasonable.  If the spreadsheet is adding the tic mark, then it does it not only to potential date and time data, but also to numeric data as well (see my Example2).  It apparently leaves alpha data alone, as I have never seen the tic added to a string which had no colons or hyphens.

It's interesting that the fix is to remove all spaces from the spreadsheet.  The logic escapes me, but if it works, it works.  The only thing to watch for is be sure to (in the fortran program) preface numeric, time, and date data with at least one space, or the removal process chokes.

Thanks for your thoughts....

Dan

Re: Problems making a CSV file

Hi -

It would seem Excel and OpenOffice don't quite behave exactly the same. For example, Excel does not add the single quote for numeric data - Excel reads it as a true numeric value, even if the .csv has it preceded with space characters.

Regards

10 (edited by JohnWasilewski 2013-03-22 19:36:36)

Re: Problems making a CSV file

Have you tried Softmaker Office?
I swear by it.  It outshines OpenOffice by a country mile.
Not free like OpenOffice (though I think Softmaker Office 2010 might be available free of charge) but I find it totally excellent.

Please God, ask Micro$oft to abandon the loathsome unintuitive vast time-wasting Office 2007 onwards interface and revert to ongoing development of  to Office 2003!

I know this is off-topic. 
Do remove it if you like, Jeff.
---
J.