      How to speed up bulk insert to MS SQL Server using pyodbc
                Problem description

                Below is the code I'd like some help with. I have to run it over 1,300,000 rows, and inserting ~300,000 rows takes up to 40 minutes.

                I figure bulk insert is the way to speed it up? Or is it slow because I'm iterating over the rows in the for data in reader: portion?

                import csv
                import os

                # newpath, outfile and cnxn (an open connection) are defined earlier
                # opens the prepped csv file
                with open(os.path.join(newpath, outfile), 'r') as f:
                    # hooks csv reader to file
                    reader = csv.reader(f)
                    # pulls out the columns (which match the SQL table)
                    columns = next(reader)
                    # trims any extra spaces
                    columns = [x.strip(' ') for x in columns]
                    # starts SQL statement
                    query = 'bulk insert into SpikeData123({0}) values ({1})'
                    # puts column names in SQL query 'query'
                    query = query.format(','.join(columns), ','.join('?' * len(columns)))

                    print('Query is: %s' % query)
                    # starts cursor from cnxn (which works)
                    cursor = cnxn.cursor()
                    # uploads everything by row, committing after every single row
                    for data in reader:
                        cursor.execute(query, data)
                        cursor.commit()
                

                I am dynamically picking my column headers on purpose (as I would like to write the most Pythonic code possible).

                SpikeData123 is the table name.

                Recommended answer

                Update - July 2021: bcpyaz is a wrapper for Microsoft's bcp utility.

                Update - April 2019: As noted in the comment from @SimonLang, BULK INSERT under SQL Server 2017 and later apparently does support text qualifiers in CSV files (ref: here).
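For SQL Server 2017 and later, the text-qualifier support mentioned in this update is exposed through the FORMAT and FIELDQUOTE options of BULK INSERT. The following is only a minimal sketch of how such a statement might be built from Python: the helper name build_csv_bulk_insert and the file name biTest.csv are made up for illustration, and FIRSTROW = 2 assumes the CSV file has a single header row.

```python
def build_csv_bulk_insert(table: str, server_path: str, first_row: int = 2) -> str:
    """Build a T-SQL BULK INSERT statement for a quoted CSV file.

    Hypothetical helper for illustration. FORMAT = 'CSV' and FIELDQUOTE
    require SQL Server 2017 or later.
    """
    # Double any single quotes so the path stays a valid T-SQL string literal.
    safe_path = server_path.replace("'", "''")
    return (
        f"BULK INSERT {table}\n"
        f"FROM '{safe_path}'\n"
        "WITH (\n"
        "    FORMAT = 'CSV',\n"
        "    FIELDQUOTE = '\"',\n"
        f"    FIRSTROW = {first_row}\n"
        ");"
    )


# The path must be readable by the SQL Server instance itself (assumed file name).
sql = build_csv_bulk_insert("myDb.dbo.SpikeData123", r"C:\__tmp\biTest.csv")
print(sql)
```

The resulting string would be passed to crsr.execute(sql) in the same way as any other statement. The table name is interpolated directly, so it should come from trusted code and not from user input.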

                BULK INSERT will almost certainly be much faster than reading the source file row-by-row and doing a regular INSERT for each row. However, both BULK INSERT and BCP have a significant limitation regarding CSV files: before SQL Server 2017 (see the April 2019 update above), they cannot handle text qualifiers (ref: here). That is, if your CSV file does not have qualified text strings in it ...

                1,Gord Thompson,2015-04-15
                2,Bob Loblaw,2015-04-07
                

                ... then you can BULK INSERT it, but if it contains text qualifiers (because some text values contain commas) ...

                1,"Thompson, Gord",2015-04-15
                2,"Loblaw, Bob",2015-04-07
                

                ... then BULK INSERT cannot handle it. Still, it might be faster overall to pre-process such a CSV file into a pipe-delimited file ...

                1|Thompson, Gord|2015-04-15
                2|Loblaw, Bob|2015-04-07
                

                ... or a tab-delimited file (where → represents the tab character) ...

                1→Thompson, Gord→2015-04-15
                2→Loblaw, Bob→2015-04-07
                

                ... and then BULK INSERT that file. For the latter (tab-delimited) file, the BULK INSERT code would look something like this:

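                import pypyodbc  # pyodbc offers the same connect/cursor/execute/commit calls

                conn_str = "DSN=myDb_SQLEXPRESS;"
                cnxn = pypyodbc.connect(conn_str)
                crsr = cnxn.cursor()
                # Raw string: the backslashes in the path and in \t and \n must reach
                # SQL Server unchanged instead of being treated as Python escapes
                # (in a normal string, \b in the path would become a backspace).
                sql = r"""
                BULK INSERT myDb.dbo.SpikeData123
                FROM 'C:\__tmp\biTest.txt' WITH (
                    FIELDTERMINATOR='\t',
                    ROWTERMINATOR='\n'
                    );
                """
                crsr.execute(sql)
                cnxn.commit()
                crsr.close()
                cnxn.close()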
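
The pre-processing step described above (turning a CSV file with text qualifiers into a tab-delimited file) could be sketched roughly as follows. This is an illustration, not part of the original answer: the function name csv_to_tab_delimited is made up, it assumes no field contains a tab or a line break, and it writes CRLF line endings on the assumption that BULK INSERT treats ROWTERMINATOR='\n' as a carriage return plus line feed.

```python
import csv
import os
import tempfile


def csv_to_tab_delimited(src_path: str, dst_path: str) -> int:
    """Rewrite a quoted CSV file as a tab-delimited file; return the row count."""
    rows_written = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        for row in csv.reader(src):  # csv.reader strips the text qualifiers
            # Assumption: a tab or line break inside a field would corrupt
            # the output, so fail loudly instead of writing a broken file.
            if any(ch in field for field in row for ch in "\t\r\n"):
                raise ValueError(f"field contains tab or newline: {row!r}")
            # CRLF line endings (assumed to match ROWTERMINATOR='\n').
            dst.write("\t".join(row) + "\r\n")
            rows_written += 1
    return rows_written


# Small demonstration using the sample rows from the answer.
with tempfile.TemporaryDirectory() as tmp:
    src_file = os.path.join(tmp, "in.csv")
    dst_file = os.path.join(tmp, "out.txt")
    with open(src_file, "w", newline="", encoding="utf-8") as f:
        f.write('1,"Thompson, Gord",2015-04-15\n2,"Loblaw, Bob",2015-04-07\n')
    count = csv_to_tab_delimited(src_file, dst_file)
    with open(dst_file, newline="", encoding="utf-8") as f:
        converted = f.read()

print(count)
print(repr(converted))
```

Because the file is streamed row by row, memory use stays flat even for a file with well over a million rows.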
                

                Note: As mentioned in a comment, executing a BULK INSERT statement is only applicable if the SQL Server instance can directly read the source file. For cases where the source file is on a remote client, see this answer.
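
When the file lives on the client and the server cannot read it, one client-side option (not necessarily the one the linked answer describes) is to send the rows in batches with executemany and pyodbc's fast_executemany cursor attribute, available in pyodbc 4.0.19 and later, committing once at the end instead of after every row. The sketch below is illustrative only: the helper names build_insert, batched and insert_rows and the batch size are assumptions, and the column names are assumed to come from a trusted file header.

```python
from itertools import islice


def build_insert(table, columns):
    """Build a parameterised INSERT statement (names are assumed trusted)."""
    placeholders = ", ".join("?" * len(columns))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def insert_rows(cnxn, table, columns, rows, batch_size=10_000):
    """Insert rows in batches over an open pyodbc connection; return the count."""
    sql = build_insert(table, columns)
    cursor = cnxn.cursor()
    # pyodbc 4.0.19+: send each batch as a parameter array, not row by row.
    cursor.fast_executemany = True
    total = 0
    for batch in batched(rows, batch_size):
        cursor.executemany(sql, batch)
        total += len(batch)
    cnxn.commit()  # one commit at the end instead of one per row
    return total


print(build_insert("SpikeData123", ["id", "name", "created"]))
```

With the question's variables, a call might look like insert_rows(cnxn, "SpikeData123", columns, reader), where reader is the csv.reader positioned after the header row.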
