Double quote csv elements in Golang

What I want to do

From General format of CSV file (RFC4180 Japanese translation)

Each field (in the record) may or may not be enclosed in double quotes.

So, depending on the project, someone may well say "let's enclose every field in double quotes!" That is exactly what happened to me.

So, as the title says, what I want to do this time is **double-quote each CSV element in Golang!**

However, Golang's standard csv encoder always escapes a double quote by doubling it, so a field that you wrapped by hand as "column" comes out as """column""". Not what we want.
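
To make the problem concrete, here is a minimal sketch that uses only the standard encoding/csv package. The helper name encodeRow is made up for this illustration and is not part of any library.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// encodeRow (hypothetical helper) writes one record with the standard
// encoding/csv writer and returns the encoded line.
func encodeRow(row []string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	if err := w.Write(row); err != nil {
		return "", err
	}
	w.Flush()
	if err := w.Error(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Plain fields are written without any quotes.
	plain, err := encodeRow([]string{"12", "John"})
	if err != nil {
		panic(err)
	}
	fmt.Print(plain)

	// Fields wrapped by hand: the writer treats the quotes as data,
	// doubles them, and then encloses the whole field in quotes again.
	wrapped, err := encodeRow([]string{`"12"`, `"John"`})
	if err != nil {
		panic(err)
	}
	fmt.Print(wrapped)
}
```

The first line printed is 12,John and the second is """12""","""John""", which is the tripled-quote result described above.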

Conclusion

The base is gocsv. With plain encoding like the full code below, the output looks like this.

client_id,client_name,client_age
12,John,21
13,Fred,
14,James,32
15,Danny,
Full code

base.go


package main

import (
	"fmt"
	"bytes"

	"github.com/gocarina/gocsv"
)

type Client struct { // Our example struct, you can use "-" to ignore a field
	Id      string `csv:"client_id"`
	Name    string `csv:"client_name"`
	Age     string `csv:"client_age"`
	NotUsed string `csv:"-"`
}

//base data
func basedata() []*Client {
	clients := []*Client{}
	clients = append(clients, &Client{Id: "12", Name: "John", Age: "21"}) // Add clients
	clients = append(clients, &Client{Id: "13", Name: "Fred"})
	clients = append(clients, &Client{Id: "14", Name: "James", Age: "32"})
	clients = append(clients, &Client{Id: "15", Name: "Danny"})
	return clients
}

func main() {
	clients := basedata()
	out := bytes.Buffer{}
	err := gocsv.Marshal(&clients, &out) // Get all clients as CSV string
	if err != nil {
		panic(err)
	}
	csvContent := out.Bytes()
	fmt.Println(string(csvContent)) // Display all clients as CSV string

}

No matter how you attach " to each element before encoding, Marshal escapes it to "", so change your approach. Once the encoded bytes are decoded again, each element comes back as a plain string, so add the " at that point.

gocsv lets you control the delimiter and the line ending, so take those values from there.

By default these are the values of the gocsv.SafeCSVWriter returned by gocsv.DefaultCSVWriter, so the delimiter is , and the line ending is \n.

These values can be changed with gocsv.SetCSVWriter, so adjust them to match your implementation.

The full sample is in the Gist; the relevant processing is excerpted below.
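
As far as I understand, gocsv's SafeCSVWriter wraps the standard encoding/csv writer, so the two values in question correspond to that writer's Comma and UseCRLF fields. The sketch below uses only the standard library to show how those two settings change the output; encodeWith is a hypothetical helper, and wiring a writer into gocsv through SetCSVWriter is left out on purpose.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// encodeWith (hypothetical helper) encodes rows with the given delimiter
// and line-ending setting of the standard csv.Writer.
func encodeWith(rows [][]string, comma rune, useCRLF bool) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	w.Comma = comma     // field delimiter, ',' by default
	w.UseCRLF = useCRLF // false: "\n", true: "\r\n"
	if err := w.WriteAll(rows); err != nil { // WriteAll also flushes
		return "", err
	}
	return buf.String(), nil
}

func main() {
	rows := [][]string{{"12", "John"}, {"13", "Fred"}}

	def, err := encodeWith(rows, ',', false)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", def)

	custom, err := encodeWith(rows, ';', true)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", custom)
}
```

Whatever delimiter and line ending the writer ends up using are the ones the re-quoting step has to reproduce.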

base.go


// convert re-quotes CSV produced by gocsv.Marshal so that every field is
// enclosed in double quotes. Pass the buffer used as Marshal's second argument.
// Besides "bytes", this needs "encoding/csv" and "strings" in the imports.
func convert(b *bytes.Buffer) ([]byte, error) {
	// Size the output before reading: ReadAll drains b.
	out := make([]byte, 0, b.Len()*2)

	// If you changed the writer with SetCSVWriter, set the delimiter
	// and the line ending that your writer uses.
	delimiter := ','
	lineEnding := "\n"

	// Decode again so that each field comes back without quotes or escapes.
	reader := csv.NewReader(b)
	reader.Comma = delimiter
	reader.LazyQuotes = true
	lines, err := reader.ReadAll()
	if err != nil {
		return nil, err
	}

	// Rebuild the rows, quoting every field.
	for _, line := range lines {
		for i, part := range line {
			if i != 0 {
				out = append(out, string(delimiter)...)
			}
			out = append(out, escape(part)...)
		}
		out = append(out, lineEnding...)
	}
	return out, nil
}

// escape turns XXX into "XXX", doubling any " inside the field.
func escape(part string) string {
	return "\"" + strings.ReplaceAll(part, "\"", "\"\"") + "\""
}

Execution result. If you want to leave the header row (the tag names) as it is, adjust the loop over lines in convert so that it skips the first row.
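
The header-skipping adjustment could be sketched as follows. This is a self-contained variant using only the standard library, under the assumptions that the delimiter is , and the line ending is \n, and that the header fields need no quoting; convertKeepHeader and quoteField are hypothetical names, not part of the Gist.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"strings"
)

// quoteField wraps a field in double quotes, doubling any quotes inside it.
func quoteField(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// convertKeepHeader (hypothetical variant of convert) re-quotes every row
// except the first one, which is written back without added quotes.
func convertKeepHeader(encoded []byte) ([]byte, error) {
	lines, err := csv.NewReader(bytes.NewReader(encoded)).ReadAll()
	if err != nil {
		return nil, err
	}
	var out bytes.Buffer
	for n, line := range lines {
		for i, part := range line {
			if i != 0 {
				out.WriteByte(',')
			}
			if n == 0 {
				out.WriteString(part) // header row: assumed to need no quoting
			} else {
				out.WriteString(quoteField(part))
			}
		}
		out.WriteByte('\n')
	}
	return out.Bytes(), nil
}

func main() {
	// Input as an encoder would produce it: quotes only where required.
	in := "client_id,client_name\n" +
		`12,"John ""JJ"" Doe"` + "\n" +
		`13,"Fred, Jr."` + "\n"
	got, err := convertKeepHeader([]byte(in))
	if err != nil {
		panic(err)
	}
	fmt.Print(string(got))
}
```

Here the header line stays client_id,client_name while the data rows become "12","John ""JJ"" Doe" and "13","Fred, Jr.". For comparison, the unmodified convert quotes the header row as well and produces the following: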

"client_id","client_name","client_age"
"12","John","21"
"13","Fred",""
"14","James","32"
"15","Danny",""
Bonus: Background to the conclusion

Enjoy the story of a middle-aged guy's Saturday night spent struggling with csv in Golang.

Trigger: someone said "I can't handle double quotes in csv!", so I looked into it.

Using the reference article as a hint, I wondered whether the quoting could be wrapped in somewhere inside the encoder. Below is how the muddy investigation went.

- Can I use SetCSVWriter in gocsv to override the io.Writer or the csv.Writer of `encoding/csv`?
  - io.Writer deals in raw bytes, so writing my own there is no different from writing my own csv parser. Rejected.
  - encoding/csv passes around a struct it defines internally rather than an interface, so it cannot be overridden given how Golang works.
  - → This idea is useless!
- Conversely, can I swap the writer that gocsv uses and do the processing that adds " there?
  - gocsv defines an interface called CSVWriter! It looks usable!
  - No. A writer-generating function can be registered with SetCSVWriter, but that function is typed with *SafeCSVWriter instead of CSVWriter. The types do not fit...
  - But looking at how gocsv uses it, nothing of SafeCSVWriter is used beyond the CSVWriter interface! Would it work if the definition were changed to CSVWriter?
  - [fork](https://github.com/developer-kikikaikai/gocsv/tree/abstruct_csvwriter) it and try overriding with CSVWriter. Oh, it works!
- Let me put together sample code for the double quoting! It would be great if it were incorporated officially. Shall I write a test for a real case that runs my own CSVWriter?
  - Whoa, the double quotes I set got escaped!
- It looked good up to here, but it seems this cannot be solved this way.
  - gocsv's decoding handles " well. You can get the data for each element!
  - Then, if I read the double-quoted output from above back column by column, I get the strings without the escaping! Couldn't I use that to write into []byte and rebuild the output?
  - Yes, that works. The escaping of " inside the strings is lost as well, so let's add it back!
  - OK! It went well! After a break, I'll send a PR to gocsv and that settles it!
- ...wait, doesn't that make playing with `CSVWriter` meaningless?

So it was a Saturday Night Fever of debugging and fixing the official code, all of which ended up in vain.

I wondered whether to publish the traces of my efforts, but with the number of use cases down to zero it did not seem worth it, so I put that on hold. I ended up writing this article and the Gist instead.

References

- General format of CSV file (RFC4180 Japanese translation)
- 4.1.6 CSV format description rules
- When outputting CSV with golang, elements cannot be enclosed in double quotes
- gocsv official
