46 commits
ae5bba9
[+FEAT] Implement Seekable Iterator
paales Sep 23, 2013
8ec9ce2
Fixed LibreOffice date import
karlis-i Mar 18, 2016
3a844ec
Fixed LibreOffice date import
karlis-i Mar 18, 2016
9a03959
Merge pull request #1 from karlis-i/karlis-i-patch-1
karlis-i Mar 18, 2016
237ad9e
Fixed LibreOffice date import
karlis-i Mar 18, 2016
f48d859
XLS date fix (proper commit)
karlis-i Mar 18, 2016
adc4f60
LibreOffice date pattern fix
karlis-i Mar 18, 2016
6350aeb
Fixed LibreOffice date import
karlis-i Mar 18, 2016
1e9b801
SAF-58: Fixed bug where XLSX reader could not be found
ChrisJasperse Apr 1, 2016
ee26fe8
SAF-58: Unlimited shared string cache. Fixed bug where first XSLX wor…
ChrisJasperse Apr 1, 2016
7bc115e
Fixed bug where first XSLX worksheet was inaccessible (again)
ChrisJasperse Apr 1, 2016
9185a3d
Excel's "General" format (v.0.5.7)
pilsetnieks Jan 31, 2015
8424977
Excel's "General" format (v.0.5.7)
pilsetnieks Jan 31, 2015
3bee7dd
v.0.5.8.
pilsetnieks Jan 31, 2015
e1b2e40
v.0.5.8.
pilsetnieks Jan 31, 2015
4a28865
avoid undefined offset error
Apr 17, 2015
3880f98
v.0.5.9
pilsetnieks Apr 18, 2015
a982e9c
v.0.5.10.
pilsetnieks Apr 18, 2015
812bde0
v.0.5.11: XLSX text cells
pilsetnieks Apr 30, 2015
8842207
Fixed bug where first XSLX worksheet was inaccessible, after updating…
ChrisJasperse Apr 4, 2016
0333251
Merge branch 'feature/xslx'
ChrisJasperse Apr 4, 2016
ee473f1
SAF-58: Unlimited shared string cache.
ChrisJasperse Apr 4, 2016
d5e6cd9
SAF-58: Solve PHP7 Deprecation warnings
paales Apr 4, 2016
1c983ab
Another LibreOffice XLS fix
karlis-i Jun 2, 2016
1eb524d
Update composer.json
karlis-i Jun 2, 2016
577a35a
Update composer.json
karlis-i Jun 2, 2016
7fe9ff5
Get Sheet Id in various formats
TonisOrmisson Jun 8, 2016
90e9414
test
TonisOrmisson Nov 25, 2016
ecd2571
Fixed read for general format. If cell have general format, in her va…
aandolg Jan 10, 2018
e7c0887
merge pull request
Mar 9, 2018
7f2b823
Merge pull request #1 from karlis-i/master
fujaru Mar 9, 2018
0b7479a
Enabled FormatValue (number, date, etc) for xlsx
Mar 9, 2018
bb5a7b6
Merge branch 'master' of https://github.com/fujaru/spreadsheet-reader
Mar 9, 2018
c3ac66f
Merge branch 'fixed_read_for_general_format' of https://github.com/aa…
Mar 9, 2018
7e65933
Merge branch 'aandolg-fixed_read_for_general_format'
Mar 9, 2018
a3d8fc4
Merge remote-tracking branch 'upstream/master'
TonisOrmisson Jan 9, 2020
aa687f1
fix composer changes
TonisOrmisson Jan 9, 2020
ec3e81d
Fix use of `continue` statement under `switch case`
syncxplus Apr 16, 2020
54753a4
Fix currency format
syncxplus Jul 23, 2020
d0b3294
Don't format value as float if starts with 0
syncxplus Dec 1, 2020
c30ba56
ignore .idea
TonisOrmisson Apr 27, 2021
6c06bf2
Merge remote-tracking branch 'syncxplus/master'
TonisOrmisson Apr 27, 2021
a6420d4
Get options before creating handle
webportnoy Jul 4, 2022
e4d56f0
Example for options to readme
webportnoy Jul 4, 2022
bdfc946
XLSX: format value without Decimals
webportnoy Jul 22, 2022
3be2499
Apply encoding from Options for CSV files
webportnoy Aug 17, 2022
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,3 +1,4 @@
 .DS_Store
 test
-materials
+materials
+.idea
24 changes: 14 additions & 10 deletions CHANGELOG.md
@@ -1,3 +1,7 @@
+### v.0.5.12 2016-03-18
+
+- Added a fix for recognising dates in XLS files created by LibreOffice
+
 ### v.0.5.11 2015-04-30

 - Added a special case for cells formatted as text in XLSX. Previously leading zeros would get truncated if a text cell contained only numbers.
@@ -50,10 +54,10 @@ Currently only decimal number values are converted to PHP's floats.

 ### v.0.5.1 2013-06-27

-- Fixed file type choice when using mime-types (previously there were problems with
+- Fixed file type choice when using mime-types (previously there were problems with
 XLSX and ODS mime-types) (Thanks to [incratec](https://github.com/incratec))

-- Fixed an error in XLSX iterator where `current()` would advance the iterator forward
+- Fixed an error in XLSX iterator where `current()` would advance the iterator forward
 with each call. (Thanks to [osuwariboy](https://github.com/osuwariboy))

 ### v.0.5.0 2013-06-17
@@ -62,19 +66,19 @@ with each call. (Thanks to [osuwariboy](https://github.com/osuwariboy))
 - The `Sheets()` method lets you retrieve a list of all sheets present in the file.
 - `ChangeSheet($Index)` method changes the sheet in the reader to the one specified.

-- Previously temporary files that were extracted, were deleted after the SpreadsheetReader
-was destroyed but the empty directories remained. Now those are cleaned up as well.
+- Previously temporary files that were extracted, were deleted after the SpreadsheetReader
+was destroyed but the empty directories remained. Now those are cleaned up as well.

 ### v.0.4.3 2013-06-14

-- Bugfix for shared string caching in XLSX files. When the shared string count was larger
-than the caching limit, instead of them being read from file, empty strings were returned.
+- Bugfix for shared string caching in XLSX files. When the shared string count was larger
+than the caching limit, instead of them being read from file, empty strings were returned.

 ### v.0.4.2 2013-06-02

-- XLS file reading relies on the external Spreadsheet_Excel_Reader class which, by default,
-reads additional information about cells like fonts, styles, etc. Now that is disabled
-to save some memory since the style data is unnecessary anyway.
+- XLS file reading relies on the external Spreadsheet_Excel_Reader class which, by default,
+reads additional information about cells like fonts, styles, etc. Now that is disabled
+to save some memory since the style data is unnecessary anyway.
 (Thanks to [ChALkeR](https://github.com/ChALkeR) for the tip.)

-Martins Pilsetnieks <pilsetnieks@gmail.com>
+Martins Pilsetnieks <pilsetnieks@gmail.com>
9 changes: 9 additions & 0 deletions README.md
@@ -57,6 +57,15 @@ Example:
 If a sheet is changed to the same that is currently open, the position in the file still reverts to the beginning, so as to conform
 to the same behavior as when changed to a different sheet.

+Passing options:
+
+<?php
+$Options = array(
+'TempDir' => "/tmp"
+);
+$Reader = new SpreadsheetReader('example.xlsx', false, false, $Options);
+?>
+
 ### Testing

 From the command line:
9 changes: 6 additions & 3 deletions SpreadsheetReader.php
@@ -13,7 +13,7 @@ class SpreadsheetReader implements SeekableIterator, Countable
 const TYPE_ODS = 'ODS';

 private $Options = array(
-'Delimiter' => '',
+'Delimiter' => ';',
 'Enclosure' => '"'
 );

@@ -37,7 +37,7 @@ class SpreadsheetReader implements SeekableIterator, Countable
 * @param string Original filename (in case of an uploaded file), used to determine file type, optional
 * @param string MIME type from an upload, used to determine file type, optional
 */
-public function __construct($Filepath, $OriginalFilename = false, $MimeType = false)
+public function __construct($Filepath, $OriginalFilename = false, $MimeType = false, $Options = array())
 {
 if (!is_readable($Filepath))
 {
@@ -158,12 +158,15 @@ public function __construct($Filepath, $OriginalFilename = false, $MimeType = fa
 }
 }

+// Get options before creating handle
+$this -> Options = array_merge($this -> Options, $Options);
+
 // 2. Create handle
 switch ($this -> Type)
 {
 case self::TYPE_XLSX:
 self::Load(self::TYPE_XLSX);
-$this -> Handle = new SpreadsheetReader_XLSX($Filepath);
+$this -> Handle = new SpreadsheetReader_XLSX($Filepath, $this -> Options);
 break;
 case self::TYPE_CSV:
 self::Load(self::TYPE_CSV);
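The option handling above relies on `array_merge` letting caller-supplied values override the class defaults. A minimal stand-alone sketch of that behaviour follows; the variable names are hypothetical, and the comment about delimiter detection assumes the merged options are forwarded to the CSV reader, which is not visible in this hunk.

```php
<?php
// Hypothetical stand-alone sketch; not code from the pull request.

// Defaults as they appear in SpreadsheetReader after this change.
$Defaults = array(
    'Delimiter' => ';',
    'Enclosure' => '"'
);

// Options a caller might pass as the fourth constructor argument.
$Options = array(
    'Delimiter' => '',   // assumption: an empty value re-enables auto-detection in the CSV reader
    'TempDir' => '/tmp'
);

// For string keys, array_merge keeps the value from the later array.
$Merged = array_merge($Defaults, $Options);

var_export($Merged);
```

Because the default delimiter changes from an empty string to `;`, a caller who wants the CSV delimiter detected automatically would presumably have to pass `'Delimiter' => ''` explicitly.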
66 changes: 39 additions & 27 deletions SpreadsheetReader_CSV.php
@@ -11,7 +11,8 @@ class SpreadsheetReader_CSV implements Iterator, Countable
 */
 private $Options = array(
 'Delimiter' => ';',
-'Enclosure' => '"'
+'Enclosure' => '"',
+'Encoding' => 'auto'
 );

 private $Encoding = 'UTF-8';
@@ -49,6 +50,43 @@ public function __construct($Filepath, array $Options = null)
 $this -> Options = array_merge($this -> Options, $Options);
 $this -> Handle = fopen($Filepath, 'r');

+if( $this -> Options['Encoding'] == "auto" )
+{
+$this -> AutoDetectEncoding();
+}
+else{
+$this -> Encoding = $this -> Options['Encoding'];
+}
+
+// Checking for the delimiter if it should be determined automatically
+if (!$this -> Options['Delimiter'])
+{
+// fgetcsv needs single-byte separators
+$Semicolon = ';';
+$Tab = "\t";
+$Comma = ',';
+
+// Reading the first row and checking if a specific separator character
+// has more columns than others (it means that most likely that is the delimiter).
+$SemicolonCount = count(fgetcsv($this -> Handle, null, $Semicolon));
+fseek($this -> Handle, $this -> BOMLength);
+$TabCount = count(fgetcsv($this -> Handle, null, $Tab));
+fseek($this -> Handle, $this -> BOMLength);
+$CommaCount = count(fgetcsv($this -> Handle, null, $Comma));
+fseek($this -> Handle, $this -> BOMLength);
+
+$Delimiter = $Semicolon;
+if ($TabCount > $SemicolonCount || $CommaCount > $SemicolonCount)
+{
+$Delimiter = $CommaCount > $TabCount ? $Comma : $Tab;
+}
+
+$this -> Options['Delimiter'] = $Delimiter;
+}
+}
+
+private function AutoDetectEncoding()
+{
 // Checking the file for byte-order mark to determine encoding
 $BOM16 = bin2hex(fread($this -> Handle, 2));
 if ($BOM16 == 'fffe')
@@ -95,32 +133,6 @@ public function __construct($Filepath, array $Options = null)
 {
 fseek($this -> Handle, $this -> BOMLength);
 }
-
-// Checking for the delimiter if it should be determined automatically
-if (!$this -> Options['Delimiter'])
-{
-// fgetcsv needs single-byte separators
-$Semicolon = ';';
-$Tab = "\t";
-$Comma = ',';
-
-// Reading the first row and checking if a specific separator character
-// has more columns than others (it means that most likely that is the delimiter).
-$SemicolonCount = count(fgetcsv($this -> Handle, null, $Semicolon));
-fseek($this -> Handle, $this -> BOMLength);
-$TabCount = count(fgetcsv($this -> Handle, null, $Tab));
-fseek($this -> Handle, $this -> BOMLength);
-$CommaCount = count(fgetcsv($this -> Handle, null, $Comma));
-fseek($this -> Handle, $this -> BOMLength);
-
-$Delimiter = $Semicolon;
-if ($TabCount > $SemicolonCount || $CommaCount > $SemicolonCount)
-{
-$Delimiter = $CommaCount > $TabCount ? $Comma : $Tab;
-}
-
-$this -> Options['Delimiter'] = $Delimiter;
-}
 }

 /**
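The delimiter detection moved in this file counts how many columns the first row yields for each candidate separator and prefers the one that produces the most, with `;` as the fallback. The sketch below reproduces that decision rule on a single string; `DetectDelimiter` is a hypothetical helper, and it uses `str_getcsv` on a line of text rather than `fgetcsv` on the file handle as the class does.

```php
<?php
// Hypothetical helper, not part of SpreadsheetReader_CSV.
function DetectDelimiter(string $FirstLine): string
{
    $Semicolon = ';';
    $Tab = "\t";
    $Comma = ',';

    // Number of columns the line splits into for a given separator.
    $CountColumns = function (string $Delimiter) use ($FirstLine): int {
        return count(str_getcsv($FirstLine, $Delimiter, '"', '\\'));
    };

    $SemicolonCount = $CountColumns($Semicolon);
    $TabCount = $CountColumns($Tab);
    $CommaCount = $CountColumns($Comma);

    // Same rule as in the diff: the semicolon wins unless tab or comma
    // yields strictly more columns.
    $Delimiter = $Semicolon;
    if ($TabCount > $SemicolonCount || $CommaCount > $SemicolonCount) {
        $Delimiter = $CommaCount > $TabCount ? $Comma : $Tab;
    }

    return $Delimiter;
}
```

A line without any of the three separators, or one where the counts tie, falls back to the semicolon.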
46 changes: 33 additions & 13 deletions SpreadsheetReader_XLSX.php
@@ -20,7 +20,7 @@ class SpreadsheetReader_XLSX implements Iterator, Countable
 * With large shared string caches there are huge performance gains, however a lot of memory could be used which
 * can be a problem, especially on shared hosting.
 */
-const SHARED_STRING_CACHE_LIMIT = 50000;
+const SHARED_STRING_CACHE_LIMIT = null;

 private $Options = array(
 'TempDir' => '',
@@ -370,17 +370,29 @@ public function Sheets()
 $this -> Sheets = array();
 foreach ($this -> WorkbookXML -> sheets -> sheet as $Index => $Sheet)
 {
-$Attributes = $Sheet -> attributes('r', true);
-foreach ($Attributes as $Name => $Value)
+$AttributesWithPrefix = $Sheet -> attributes('r', true);
+$Attributes = $Sheet -> attributes();
+
+$rId = 0;
+$sheetId = 0;
+
+foreach ($AttributesWithPrefix as $Name => $Value)
 {
 if ($Name == 'id')
 {
-$SheetID = (int)str_replace('rId', '', (string)$Value);
+$rId = (int)str_replace('rId', '', (string)$Value);
 break;
 }
 }
+foreach ($Attributes as $Name => $Value)
+{
+if ($Name == 'sheetId') {
+$sheetId = (int)$Value;
+break;
+}
+}

-$this -> Sheets[$SheetID] = (string)$Sheet['name'];
+$this -> Sheets[min($rId, $sheetId)] = (string)$Sheet['name'];
 }
 ksort($this -> Sheets);
 }
@@ -453,7 +465,7 @@ private function PrepareSharedStringCache()
 case 't':
 if ($this -> SharedStrings -> nodeType == XMLReader::END_ELEMENT)
 {
-continue;
+break;
 }
 $CacheValue .= $this -> SharedStrings -> readString();
 break;
@@ -556,7 +568,7 @@ private function GetSharedString($Index)
 $this -> SharedStrings -> next('si');
 $this -> SSForwarded = true;
 $this -> SharedStringIndex++;
-continue;
+break;
 }
 else
 {
@@ -578,7 +590,7 @@ private function GetSharedString($Index)
 case 't':
 if ($this -> SharedStrings -> nodeType == XMLReader::END_ELEMENT)
 {
-continue;
+break;
 }
 $Value .= $this -> SharedStrings -> readString();
 break;
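In PHP, `continue` inside a `switch` behaves like `break`, and since PHP 7.3 it raises a warning suggesting `continue 2`, which is why these hunks replace it. The following sketch shows the pattern with `break`; the `CollectText` function and its array-based node list are hypothetical stand-ins for the XMLReader loop in the class.

```php
<?php
// Hypothetical sketch: arrays stand in for XMLReader nodes.
function CollectText(array $Nodes): string
{
    $Value = '';
    foreach ($Nodes as $Node) {
        switch ($Node['name']) {
            case 't':
                if ($Node['end']) {
                    // Leaves the switch only; the foreach then moves to the next node.
                    break;
                }
                $Value .= $Node['text'];
                break;
            default:
                // Other elements are ignored in this sketch.
                break;
        }
    }
    return $Value;
}

$Nodes = array(
    array('name' => 't', 'end' => false, 'text' => 'ab'),
    array('name' => 't', 'end' => true, 'text' => ''),
    array('name' => 'r', 'end' => false, 'text' => 'x'),
    array('name' => 't', 'end' => false, 'text' => 'cd'),
);
$Result = CollectText($Nodes);
```

If statements follow the `switch` inside the loop body and must be skipped for closing tags, `continue 2` would be the equivalent of the old intent; whether that applies here cannot be told from the hunks shown.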
@@ -891,7 +903,7 @@ private function FormatValue($Value, $Index)
 // Scaling
 $Value = $Value / $Format['Scale'];

-if (!empty($Format['MinWidth']) && $Format['Decimals'])
+if (!empty($Format['MinWidth']))
 {
 if ($Format['Thousands'])
 {
@@ -910,7 +922,7 @@ private function FormatValue($Value, $Index)
 // Currency/Accounting
 if ($Format['Currency'])
 {
-$Value = preg_replace('', $Format['Currency'], $Value);
+$Value = preg_replace('/\[.+\]/', $Format['Currency'], $Value);
 }
 }
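The old call passed an empty pattern, for which `preg_replace` emits an "Empty regular expression" warning and returns `null`. The new pattern replaces a bracketed section with the currency symbol. A small sketch of its effect is below; the sample strings are hypothetical and only assume that the formatted value still contains a bracketed currency token at this point.

```php
<?php
// Hypothetical inputs; the real value comes from FormatValue() in the class.
$Pattern = '/\[.+\]/';

// A single bracketed token is swapped for the currency symbol.
$Single = preg_replace($Pattern, 'EUR ', '[$EUR-2]1,234.00');

// The quantifier is greedy, so two bracketed groups are consumed as one match.
$Double = preg_replace($Pattern, 'EUR ', '[Red][$EUR-2]1,234.00');
```

Both calls produce `EUR 1,234.00`; the greedy match is harmless here but would also swallow any text sitting between two bracketed groups.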

@@ -929,7 +941,7 @@ private function FormatValue($Value, $Index)
 public function GeneralFormat($Value)
 {
 // Numeric format
-if (is_numeric($Value))
+if (is_numeric($Value) && $Value[0] != 0)
 {
 $Value = (float)$Value;
 }
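The added condition keeps values with a leading zero (for example postal codes or article numbers) as strings instead of casting them to float. The sketch below illustrates the idea with a hypothetical `GeneralFormatSketch` function; it uses a strict string comparison on the first character rather than the loose `!= 0` from the diff, which is an assumption about the intent and not the code under review.

```php
<?php
// Hypothetical sketch of the leading-zero rule; not the method from the class.
function GeneralFormatSketch($Value)
{
    // Cast to float only when the value is numeric and does not start with "0".
    if (is_numeric($Value) && substr((string)$Value, 0, 1) !== '0') {
        return (float)$Value;
    }
    return $Value;
}
```

One consequence worth noting: under this rule a value such as `0.5` also stays a string, because it starts with a zero.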
@@ -1046,7 +1058,7 @@ public function next()
 // If it is a closing tag, skip it
 if ($this -> Worksheet -> nodeType == XMLReader::END_ELEMENT)
 {
-continue;
+break;
 }

 $StyleId = (int)$this -> Worksheet -> getAttribute('s');
@@ -1080,7 +1092,7 @@ public function next()
 case 'is':
 if ($this -> Worksheet -> nodeType == XMLReader::END_ELEMENT)
 {
-continue;
+break;
 }

 $Value = $this -> Worksheet -> readString();
@@ -1099,6 +1111,14 @@ public function next()
 {
 $Value = $this -> GeneralFormat($Value);
 }
+elseif ($Value)
+{
+$Value = $this -> GeneralFormat($Value);
+}
+elseif ($Value)
+{
+$Value = $this -> GeneralFormat($Value);
+}

 $this -> CurrentRow[$Index] = $Value;
 break;
2 changes: 0 additions & 2 deletions composer.json
@@ -3,8 +3,6 @@
 "description": "Spreadsheet reader library for Excel, OpenOffice and structured text files",
 "keywords": ["spreadsheet", "xls", "xlsx", "ods", "csv", "excel", "openoffice"],
 "homepage": "https://github.com/nuovo/spreadsheet-reader",
-"version": "0.5.11",
-"time": "2015-04-30",
 "type": "library",
 "license": ["MIT"],
 "authors": [