Error while parsing a pdf #410

@xpilasneo4j

Using OpenAI, I pass a PDF and a schema to the pipeline, but no data is inserted because the writer fails with the error below. What can I do to prevent this?

ERROR:neo4j_graphrag.experimental.components.kg_writer:{code: Neo.ClientError.Statement.TypeError} {message: Property values can only be of primitive types or arrays thereof. Encountered: Map{2023 -> String("97709"), 2022 -> String("5575"), 2021 -> String("18788")}.}
neo4j.exceptions.GqlError: {gql_status: 22N01} {gql_status_description: error: data exception - invalid type. Expected the value Map{2023 -> String("97709"), 2022 -> String("5575"), 2021 -> String("18788")} to be of type BOOLEAN, STRING, INTEGER, FLOAT, DATE, LOCAL TIME, ZONED TIME, LOCAL DATETIME, ZONED DATETIME, DURATION or POINT, but was of type MAP NOT NULL.} {message: 22N01: Expected the value Map{2023 -> String("97709"), 2022 -> String("5575"), 2021 -> String("18788")} to be of type BOOLEAN, STRING, INTEGER, FLOAT, DATE, LOCAL TIME, ZONED TIME, LOCAL DATETIME, ZONED DATETIME, DURATION or POINT, but was of type MAP NOT NULL.} {diagnostic_record: {'_classification': 'CLIENT_ERROR', 'OPERATION': '', 'OPERATION_CODE': '0', 'CURRENT_SCHEMA': '/'}} {raw_classification: CLIENT_ERROR}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j_graphrag\experimental\components\kg_writer.py", line 212, in run
    self._upsert_nodes(batch, lexical_graph_config)
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j_graphrag\experimental\components\kg_writer.py", line 159, in _upsert_nodes
    self.driver.execute_query(
    ~~~~~~~~~~~~~~~~~~~~~~~~~^
        query,
        ^^^^^^
        parameters_=parameters,
        ^^^^^^^^^^^^^^^^^^^^^^^
        database_=self.neo4j_database,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j\_sync\driver.py", line 970, in execute_query
    return session._run_transaction(
           ~~~~~~~~~~~~~~~~~~~~~~~~^
        access_mode,
        ^^^^^^^^^^^^
    ...<3 lines>...
        {},
        ^^^
    )
    ^
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j\_sync\work\session.py", line 583, in _run_transaction
    result = transaction_function(tx, *args, **kwargs)
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j\_sync\driver.py", line 1307, in _work
    return transformer(res)
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j\_sync\work\result.py", line 802, in to_eager_result
    self._buffer_all()
    ~~~~~~~~~~~~~~~~^^
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j\_sync\work\result.py", line 459, in _buffer_all
    self._buffer()
    ~~~~~~~~~~~~^^
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j\_sync\work\result.py", line 448, in _buffer
    for record in self:
                  ^^^^
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j\_sync\work\result.py", line 398, in __iter__
    self._connection.fetch_message()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j\_sync\io\_common.py", line 184, in inner
    func(*args, **kwargs)
    ~~~~^^^^^^^^^^^^^^^^^
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j\_sync\io\_bolt.py", line 864, in fetch_message
    res = self._process_message(tag, fields)
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j\_sync\io\_bolt5.py", line 1208, in _process_message
    response.on_failure(summary_metadata or {})
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XavierPilas\AppData\Roaming\Python\Python313\site-packages\neo4j\_sync\io\_common.py", line 254, in on_failure
    raise self._hydrate_error(metadata)
neo4j.exceptions.CypherTypeError: {code: Neo.ClientError.Statement.TypeError} {message: Property values can only be of primitive types or arrays thereof. Encountered: Map{2023 -> String("97709"), 2022 -> String("5575"), 2021 -> String("18788")}.}
DEBUG:neo4j_graphrag.experimental.pipeline.pipeline:TASK FINISHED writer in 1.483685900006094 res={'status': <RunStatus.DONE: 'DONE'>, 'result': {'status': 'FAILURE', 'metadata': {'error': '{code: Neo.ClientError.Statement.TypeError} {message: Property values can only be of primitive types or arrays thereof. Encountered: Map{2023 -> String("97709"), 2022 -> String("5575"), 2021 -> String... (12 chars)'}}, 'timestamp': datetime.datetime(2025, 8, 19, 7, 58, 27, 795148, tzinfo=datetime.timezone.utc)}
DEBU
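
The error message says the writer received a node property whose value is a map (here, year -> value), while Neo4j only accepts primitives or arrays of primitives as property values. One possible workaround, as a minimal sketch only, is to sanitize the extracted properties before they reach the writer: either flatten each map into separate keys or serialize it to a JSON string. The function name `sanitize_properties` and the property name `revenue` are hypothetical and not part of `neo4j_graphrag`; how to hook this in between extraction and writing is left open here.

```python
import json
from typing import Any

# Python types that map onto Neo4j primitive property types.
PRIMITIVES = (bool, int, float, str)


def sanitize_properties(props: dict[str, Any], flatten: bool = True) -> dict[str, Any]:
    """Return a copy of props that contains no nested maps.

    Hypothetical helper, not part of neo4j_graphrag.
    flatten=True  -> {"revenue": {"2023": "1"}} becomes {"revenue_2023": "1"}
    flatten=False -> the map is stored as a single JSON string.
    """
    clean: dict[str, Any] = {}
    for key, value in props.items():
        if isinstance(value, dict):
            if flatten:
                for sub_key, sub_value in value.items():
                    # Values that are still nested are serialized to JSON.
                    if isinstance(sub_value, PRIMITIVES):
                        clean[f"{key}_{sub_key}"] = sub_value
                    else:
                        clean[f"{key}_{sub_key}"] = json.dumps(sub_value, default=str)
            else:
                clean[key] = json.dumps(value, default=str)
        elif isinstance(value, (list, tuple)):
            # Keep only arrays of primitives that share one type.
            same_type = len({type(v) for v in value}) <= 1
            if same_type and all(isinstance(v, PRIMITIVES) for v in value):
                clean[key] = list(value)
            else:
                clean[key] = json.dumps(value, default=str)
        elif value is None or isinstance(value, PRIMITIVES):
            clean[key] = value
        else:
            # Fall back to a string for any other object.
            clean[key] = str(value)
    return clean


# Example with the map from the error message (property name assumed).
node_props = {
    "name": "Acme",
    "revenue": {"2023": "97709", "2022": "5575", "2021": "18788"},
}
print(sanitize_properties(node_props))
```

Flattening keeps each year queryable as its own property, while the JSON-string option preserves the original shape at the cost of having to parse it when reading.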
