diff --git a/README.md b/README.md
index 8558ee2..6e4c74c 100644
--- a/README.md
+++ b/README.md
@@ -423,6 +423,22 @@ You can handle exotic encodings with the `encoding` option.
 ```ruby
 ImportUserCSV.new(content: "メール,氏名".encode('SJIS'), encoding: 'SJIS:UTF-8')
 ```
+### Custom cell sanitising
+
+By default, cell values are stripped: `" Bob  Elvis "` becomes `"Bob  Elvis"`. You may want to go further and collapse the extra inner spaces that often appear in cell values, yielding `"Bob Elvis"`.
+
+To do that, add an initialiser at `config/initializers/csv_importer.rb` and define your own rules.
+
+```ruby
+require 'csv_importer'
+module CSVImporter
+  def self.sanitize_cell(raw_value)
+    raw_value.to_s
+      .gsub(/[\b\s\u00A0]+/, ' ') # Collapse white space runs (incl. NBSP) into one space
+      .strip # Remove leading and trailing white space
+  end
+end
+```
 
 ## Development
 
diff --git a/lib/csv_importer.rb b/lib/csv_importer.rb
index 13cb062..76cd1d2 100644
--- a/lib/csv_importer.rb
+++ b/lib/csv_importer.rb
@@ -30,6 +30,11 @@ module CSVImporter
 
   class Error < StandardError; end
 
+  def self.sanitize_cell(raw_value)
+    raw_value.to_s
+      .strip # Remove leading and trailing white space
+  end
+
   # Setup DSL and config object
   def self.included(klass)
     klass.extend(Dsl)
diff --git a/lib/csv_importer/csv_reader.rb b/lib/csv_importer/csv_reader.rb
index 0673f58..2a81232 100644
--- a/lib/csv_importer/csv_reader.rb
+++ b/lib/csv_importer/csv_reader.rb
@@ -60,7 +60,7 @@ def detect_separator(csv_content)
     def sanitize_cells(rows)
       rows.map do |cells|
         cells.map do |cell|
-          cell ? cell.strip : ""
+          cell ? CSVImporter.sanitize_cell(cell) : ""
         end
       end
     end
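
Note on behaviour: the patch routes every non-nil cell through `CSVImporter.sanitize_cell`, so reopening the module in an initializer changes what the reader produces. The snippet below is a minimal sketch of that effect, not the gem's actual code: it defines a stand-in `CSVImporter` module and a free-standing `sanitize_cells` helper (both assumptions made so the example runs without the gem installed), then compares the default and overridden results.

```ruby
# Stand-in for the gem's module (assumption: lets the sketch run without csv_importer).
module CSVImporter
  # Default behaviour from the patch: stringify, then strip both ends.
  def self.sanitize_cell(raw_value)
    raw_value.to_s.strip
  end
end

# Hypothetical free-standing copy of the reader's sanitize_cells: nil cells become "".
def sanitize_cells(rows)
  rows.map do |cells|
    cells.map { |cell| cell ? CSVImporter.sanitize_cell(cell) : "" }
  end
end

rows = [["  Bob   Elvis ", nil, "Bob\u00A0\u00A0Elvis"]]

# With the default sanitizer, only the outer white space is removed.
default_result = sanitize_cells(rows)

# Reopen the module, as the README's initializer does, to collapse inner white space too.
module CSVImporter
  def self.sanitize_cell(raw_value)
    raw_value.to_s
      .gsub(/[\b\s\u00A0]+/, ' ') # backspace, white space and NBSP runs become one space
      .strip
  end
end

custom_result = sanitize_cells(rows)
```

Because the reader looks the method up at call time, the override takes effect for any import that runs after the initializer has loaded. The non-breaking space is listed explicitly because Ruby's `\s` does not match it.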