Vector Support #1999

@Knalltuete5000

Is your feature request related to a problem? Please describe.
MySQL 9.0 and MariaDB 11.7 both added support for the VECTOR(n) column type.

MySqlConnector, as of its latest commit, also supports reading from and writing to VECTOR(n) columns. It accepts float[], Memory<float>, and ReadOnlyMemory<float> as the underlying .NET type mapped to the vector column. Note that there is no official release with this support yet; I included the project directly to test the new capability.

I do not currently know whether the MySql.Data driver already supports the vector column.

Describe the solution you'd like
My current approach to supporting the vector column is to annotate the property with the column type, as in the following example, which stores a vector of size 384:

[PrimaryKey(nameof(Id))]
class EmbeddingTest
{
    public Guid Id { get; set; }

    [Column(TypeName = "vector(384)")]
    public float[] Embedding { get; set; }
}

or, without the attribute, using the OnModelCreating method:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // Map the float[] property to a fixed-size vector column.
    modelBuilder.Entity<EmbeddingTest>()
        .Property(x => x.Embedding)
        .HasColumnType("vector(384)");
}

In both cases, the .Annotation("MySql:CharSet", "utf8mb4") method call has to be removed from the generated migration.
When the code runs, the conversion/mapping between the vector column and float[] is not supported.
I have also tried changing the migration's Designer code and the ContextModelSnapshot from b.PrimitiveCollection<float>("Embedding").IsRequired().HasColumnType("vector(384)") to b.Property<float[]>("Embedding").IsRequired().HasColumnType("vector(384)").

With PrimitiveCollection<float>, the following exception is thrown:

System.InvalidOperationException : The property 'float.Embedding' cannot be mapped as a collection since it does not implement 'IEnumerable<T>'.
InternalTypeBaseBuilder.PrimitiveCollection(Type propertyType, String propertyName, MemberInfo memberInfo, Nullable`1 typeConfigurationSource, Nullable`1 configurationSource)
InternalTypeBaseBuilder.PrimitiveCollection(Type propertyType, String propertyName, Nullable`1 typeConfigurationSource, Nullable`1 configurationSource)
InternalTypeBaseBuilder.PrimitiveCollection(Type propertyType, String propertyName, Nullable`1 configurationSource)
EntityTypeBuilder.PrimitiveCollection[TProperty](String propertyName)
<>c.<BuildModel>b__0_0(EntityTypeBuilder b) line 46
ModelBuilder.Entity(String name, Action`1 buildAction)
MySqlApplicationDbContextModelSnapshot.BuildModel(ModelBuilder modelBuilder) line 25
ModelSnapshot.CreateModel()
ModelSnapshot.get_Model()
Migrator.HasPendingModelChanges()

With Property<float[]>, the following exception is thrown:

<Microsoft.EntityFrameworkCore.DbUpdateException: An error occurred while saving the entity changes. See the inner exception for details.
 ---> MySqlConnector.MySqlException (0x80004005): Value of type 'string, size: 4683' cannot be converted to 'vector' type.
   at MySqlConnector.Core.ServerSession.ReceiveReplyAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/ServerSession.cs:line 1081
   at MySqlConnector.Core.ResultSet.ReadResultSetHeaderAsync(IOBehavior ioBehavior) in /_/src/MySqlConnector/Core/ResultSet.cs:line 37
   at MySqlConnector.MySqlDataReader.ActivateResultSet(CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlDataReader.cs:line 131
   at MySqlConnector.MySqlDataReader.InitAsync(CommandListPosition commandListPosition, ICommandPayloadCreator payloadCreator, IDictionary`2 cachedProcedures, IMySqlCommand command, CommandBehavior behavior, Activity activity, IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlDataReader.cs:line 487
   at MySqlConnector.Core.CommandExecutor.ExecuteReaderAsync(CommandListPosition commandListPosition, ICommandPayloadCreator payloadCreator, CommandBehavior behavior, Activity activity, IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/CommandExecutor.cs:line 56
   at MySqlConnector.MySqlCommand.ExecuteReaderAsync(CommandBehavior behavior, IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlCommand.cs:line 357
   at MySqlConnector.MySqlCommand.ExecuteDbDataReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlCommand.cs:line 350
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.ReaderModificationCommandBatch.ExecuteAsync(IRelationalConnection connection, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at Microsoft.EntityFrameworkCore.Update.ReaderModificationCommandBatch.ExecuteAsync(IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalDatabase.SaveChangesAsync(IList`1 entries, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(IList`1 entriesToSave, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(StateManager stateManager, Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
   at Pomelo.EntityFrameworkCore.MySql.Storage.Internal.MySqlExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)

This indicates that the float[] is converted to a string before the embedding/vector is inserted into the database. Since the MySqlConnector build I used does support the vector column, the conversion must be happening on the EF Core side.

Or am I missing some special configuration, e.g. a value conversion, that is currently needed to support the vector column?
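One conversion that might be worth trying is to hand the provider a byte[] instead of a float[], since both MySQL and MariaDB store a vector as a sequence of 4-byte little-endian IEEE 754 floats. The following is only a minimal sketch: VectorBytes is a hypothetical helper (not part of Pomelo, EF Core, or MySqlConnector), it assumes a little-endian platform, and I have not verified that Pomelo accepts a byte[] parameter for a vector(384) column.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: packs a float[] into the raw bytes of its elements and back.
// Assumes the process runs on a little-endian platform, so the native byte order
// matches the little-endian layout the server expects.
public static class VectorBytes
{
    public static byte[] ToBytes(float[] vector)
    {
        // Reinterpret the floats as bytes (4 bytes per element) and copy them out.
        return MemoryMarshal.AsBytes(vector.AsSpan()).ToArray();
    }

    public static float[] FromBytes(byte[] bytes)
    {
        if (bytes.Length % sizeof(float) != 0)
            throw new ArgumentException("Length must be a multiple of 4.", nameof(bytes));

        // Reinterpret the bytes as floats and copy them into a new array.
        return MemoryMarshal.Cast<byte, float>(bytes.AsSpan()).ToArray();
    }
}
```

With EF Core this could be wired up on the property via .HasConversion(v => VectorBytes.ToBytes(v), b => VectorBytes.FromBytes(b)); whether the resulting parameter is then sent as binary rather than as a string is exactly the part that would need testing.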

Describe alternatives you've considered
If the package supported the vector type natively, I would suggest that float[], Memory<float>, and ReadOnlyMemory<float> be mapped automatically to the vector column. The length could then be set via the MaxLength attribute when attribute-based configuration is used.

Additional context

  • I am currently using Pomelo.EntityFrameworkCore.MySql Version="9.0.0-preview.3.efcore.9.0.0" together with MySqlConnector built from the GitHub repository at this commit and included directly in the project
  • MySQL and MariaDB provide different functions for working with vectors, e.g. to calculate the distance between two vectors:
    MySQL: DISTANCE(vec1, vec2, method)
    MariaDB: VEC_DISTANCE_COSINE(vec1, vec2) or VEC_DISTANCE_EUCLIDEAN(vec1, vec2) or VEC_DISTANCE(vec1, vec2)
  • MariaDB also supports a new, special vector index on a vector column, but only one vector index per table is allowed
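
For comparing results across the two servers, a client-side reference for the two common metrics can be handy. This is only a sketch of the standard Euclidean and cosine distance formulas; VectorDistance is a hypothetical helper, and I have not checked that the server functions return bit-identical values.

```csharp
using System;

// Hypothetical helper: reference implementations of the two distance metrics,
// useful for sanity-checking values returned by the server-side functions.
public static class VectorDistance
{
    public static double Euclidean(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.Sqrt(sum);
    }

    public static double Cosine(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }
        // Cosine distance is 1 minus the cosine similarity.
        // The result is NaN if either vector has zero length (norm 0).
        return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```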
