Neo4j: Cypher – Avoiding the Eager
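Beware of the eager pipe
Although I love how easy Cypher's LOAD CSV command makes it to get data into Neo4j, it currently breaks the rule of least surprise: for some queries it eagerly loads all rows, even when those queries use periodic commit.
This is something my colleague Michael noted in the second of his blog posts explaining how to use LOAD CSV successfully:
"The biggest issue that people ran into, even when following the advice I gave earlier, was that for large imports of more than one million rows, Cypher ran into an out-of-memory situation.
That was not related to commit sizes, so it happened even with PERIODIC COMMIT of small batches."
I recently spent a few days importing data into Neo4j on a Windows machine with 4GB RAM, so I was seeing this problem even earlier than Michael suggested.
Michael explains how to work out whether your query is suffering from unexpected eager evaluation:
"If you profile that query you see that there is an 'Eager' step in the query plan.
That is where the 'pull in all data' happens."
You can profile a query by prefixing it with the word 'PROFILE'. You'll need to run the query in the /webadmin console in your web browser or with the Neo4j shell.
I did this for my queries and was able to identify query patterns that get evaluated eagerly; in some cases we can work around the problem.
We'll use the Northwind data set to demonstrate how the Eager pipe can sneak into our queries, but keep in mind that this data set is small enough not to cause issues.
This is what a row in the file looks like: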
$ head -n 2 data/customerDb.csv
OrderID,CustomerID,EmployeeID,OrderDate,RequiredDate,ShippedDate,ShipVia,Freight,ShipName,ShipAddress,ShipCity,ShipRegion,ShipPostalCode,ShipCountry,CustomerID,CustomerCompanyName,ContactName,ContactTitle,Address,City,Region,PostalCode,Country,Phone,Fax,EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath,OrderID,ProductID,UnitPrice,Quantity,Discount,ProductID,ProductName,SupplierID,CategoryID,QuantityPerUnit,UnitPrice,UnitsInStock,UnitsOnOrder,ReorderLevel,Discontinued,SupplierID,SupplierCompanyName,ContactName,ContactTitle,Address,City,Region,PostalCode,Country,Phone,Fax,HomePage,CategoryID,CategoryName,Description,Picture
10248,VINET,5,1996-07-04,1996-08-01,1996-07-16,3,32.38,Vins et alcools Chevalier,59 rue de l'Abbaye,Reims,,51100,France,VINET,Vins et alcools Chevalier,Paul Henriot,Accounting Manager,59 rue de l'Abbaye,Reims,,51100,France,26.47.15.10,26.47.15.11,5,Buchanan,Steven,Sales Manager,Mr.,1955-03-04,1993-10-17,14 Garrett Hill,London,,SW1 8JR,UK,(71) 555-4848,3453,\x,"Steven Buchanan graduated from St. Andrews University, Scotland, with a BSC degree in 1976. Upon joining the company as a sales representative in 1992, he spent 6 months in an orientation program at the Seattle office and then returned to his permanent post in London. He was promoted to sales manager in March 1993. Mr. Buchanan has completed the courses ""Successful Telemarketing"" and ""International Sales Management."" He is fluent in French.",2,http://accweb/emmployees/buchanan.bmp,10248,11,14,12,0,11,Queso Cabrales,5,4,1 kg pkg.,21,22,30,30,0,5,Cooperativa de Quesos 'Las Cabras',Antonio del Valle Saavedra,Export Administrator,Calle del Rosal 4,Oviedo,Asturias,33007,Spain,(98) 598 76 54,,,4,Dairy Products,Cheeses,\x

MERGE, MERGE, MERGE
The first thing we want to do is create a node for each employee and each order, and then create a relationship between them.
We might start with the following query:
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MERGE (employee:Employee {employeeId: row.EmployeeID})
MERGE (order:Order {orderId: row.OrderID})
MERGE (employee)-[:SOLD]->(order)

This does the job, but if we profile the query like so…
PROFILE LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
WITH row LIMIT 0
MERGE (employee:Employee {employeeId: row.EmployeeID})
MERGE (order:Order {orderId: row.OrderID})
MERGE (employee)-[:SOLD]->(order)

…we'll notice an 'Eager' lurking on the third line:
==> +----------------+------+--------+----------------------------------+-----------------------------------------+
==> | Operator       | Rows | DbHits | Identifiers                      | Other                                   |
==> +----------------+------+--------+----------------------------------+-----------------------------------------+
==> | EmptyResult    |    0 |      0 |                                  |                                         |
==> | UpdateGraph(0) |    0 |      0 | employee, order, UNNAMED216      | MergePattern                            |
==> | Eager          |    0 |      0 |                                  |                                         |
==> | UpdateGraph(1) |    0 |      0 | employee, employee, order, order | MergeNode; :Employee; MergeNode; :Order |
==> | Slice          |    0 |      0 |                                  | { AUTOINT0}                             |
==> | LoadCSV        |    1 |      0 | row                              |                                         |
==> +----------------+------+--------+----------------------------------+-----------------------------------------+

You'll notice that when we profile each query we strip off the periodic commit section and add a 'WITH row LIMIT 0'. This lets us generate enough of the query plan to identify the 'Eager' operator without actually importing any data.
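The profiling recipe described above (drop the periodic commit prefix, add 'WITH row LIMIT 0', prefix with PROFILE) is mechanical, so it can be scripted. The following is only a minimal sketch in Python: the function name `to_profile_query` and its `row_alias` parameter are hypothetical helpers invented for illustration, not part of Neo4j, and it assumes the query aliases its CSV rows with `AS row` as the queries in this post do.

```python
import re


def to_profile_query(query, row_alias="row"):
    """Rewrite a LOAD CSV import query into a plan-only PROFILE variant.

    Hypothetical helper: it only edits the query text and never talks to Neo4j.
    """
    # Drop a leading 'USING PERIODIC COMMIT [n]' clause if there is one.
    stripped = re.sub(
        r"^\s*USING\s+PERIODIC\s+COMMIT(\s+\d+)?\s*",
        "",
        query,
        flags=re.IGNORECASE,
    )
    # Add 'WITH <alias> LIMIT 0' right after the first 'AS <alias>',
    # so the plan is built but no rows flow into the update clauses.
    alias_pattern = re.compile(
        r"\bAS\s+" + re.escape(row_alias) + r"\b", flags=re.IGNORECASE
    )
    limited, count = alias_pattern.subn(
        lambda m: m.group(0) + "\nWITH " + row_alias + " LIMIT 0",
        stripped,
        count=1,
    )
    if count == 0:
        raise ValueError("query has no 'AS %s' clause" % row_alias)
    return "PROFILE " + limited


example = (
    "USING PERIODIC COMMIT 1000\n"
    'LOAD CSV WITH HEADERS FROM "file:/tmp/example.csv" AS row\n'
    "MERGE (employee:Employee {employeeId: row.EmployeeID})"
)
print(to_profile_query(example))
```

The output is plain text that still has to be pasted into the Neo4j shell or the /webadmin console; the file path in the example is a placeholder.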
We want to split that query into two so that it can be processed in a non-eager manner:
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
WITH row LIMIT 0
MERGE (employee:Employee {employeeId: row.EmployeeID})
MERGE (order:Order {orderId: row.OrderID})

==> +-------------+------+--------+----------------------------------+-----------------------------------------+
==> | Operator    | Rows | DbHits | Identifiers                      | Other                                   |
==> +-------------+------+--------+----------------------------------+-----------------------------------------+
==> | EmptyResult |    0 |      0 |                                  |                                         |
==> | UpdateGraph |    0 |      0 | employee, employee, order, order | MergeNode; :Employee; MergeNode; :Order |
==> | Slice       |    0 |      0 |                                  | { AUTOINT0}                             |
==> | LoadCSV     |    1 |      0 | row                              |                                         |
==> +-------------+------+--------+----------------------------------+-----------------------------------------+

Now that we've created the employees and orders, we can join them together:
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MATCH (employee:Employee {employeeId: row.EmployeeID})
MATCH (order:Order {orderId: row.OrderID})
MERGE (employee)-[:SOLD]->(order)

==> +----------------+------+--------+-------------------------------+-----------------------------------------------------------+
==> | Operator       | Rows | DbHits | Identifiers                   | Other                                                     |
==> +----------------+------+--------+-------------------------------+-----------------------------------------------------------+
==> | EmptyResult    |    0 |      0 |                               |                                                           |
==> | UpdateGraph    |    0 |      0 | employee, order, UNNAMED216   | MergePattern                                              |
==> | Filter(0)      |    0 |      0 |                               | Property(order,orderId) == Property(row,OrderID)          |
==> | NodeByLabel(0) |    0 |      0 | order, order                  | :Order                                                    |
==> | Filter(1)      |    0 |      0 |                               | Property(employee,employeeId) == Property(row,EmployeeID) |
==> | NodeByLabel(1) |    0 |      0 | employee, employee            | :Employee                                                 |
==> | Slice          |    0 |      0 |                               | { AUTOINT0}                                               |
==> | LoadCSV        |    1 |      0 | row                           |                                                           |
==> +----------------+------+--------+-------------------------------+-----------------------------------------------------------+

Not an Eager in sight!
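When checking many import queries, spotting the Eager row by eye gets tedious. As a rough sketch, the plan text printed by the shell can be scanned with a few lines of Python. The function names `plan_operators` and `has_eager` are hypothetical, and the parser assumes the table layout shown in this post (rows starting with `==> |`, operator name in the first column); other Neo4j versions may print plans differently.

```python
from typing import List


def plan_operators(plan_text: str) -> List[str]:
    """Return the operator names from a shell-style plan table, top to bottom."""
    operators = []
    for line in plan_text.splitlines():
        line = line.strip()
        # The Neo4j shell prefixes each output line with '==>'.
        if line.startswith("==>"):
            line = line[3:].strip()
        # Border lines start with '+'; only data rows start with '|'.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        name = cells[0]
        if name and name != "Operator":  # skip the header row
            operators.append(name)
    return operators


def has_eager(plan_text: str) -> bool:
    """True if any operator in the plan is an Eager pipe."""
    # Operators may carry a numeric suffix such as 'UpdateGraph(0)'.
    return any(op.split("(")[0] == "Eager" for op in plan_operators(plan_text))


sample_plan = """
==> +----------------+------+--------+-------------+--------------+
==> | Operator       | Rows | DbHits | Identifiers | Other        |
==> +----------------+------+--------+-------------+--------------+
==> | EmptyResult    |    0 |      0 |             |              |
==> | UpdateGraph(0) |    0 |      0 |             | MergePattern |
==> | Eager          |    0 |      0 |             |              |
==> | LoadCSV        |    1 |      0 | row         |              |
==> +----------------+------+--------+-------------+--------------+
"""
print(plan_operators(sample_plan))
print(has_eager(sample_plan))
```

The sample plan is an abbreviated stand-in for real shell output, trimmed to keep the example short.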
MATCH, MATCH, MATCH, MERGE, MERGE
If we fast forward a few steps, we may have refactored our import script to the point where we create our nodes in one query and the relationships in another.
Our create query works as expected:
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MERGE (employee:Employee {employeeId: row.EmployeeID})
MERGE (order:Order {orderId: row.OrderID})
MERGE (product:Product {productId: row.ProductID})

==> +-------------+------+--------+----------------------------------------------------+--------------------------------------------------------------+
==> | Operator    | Rows | DbHits | Identifiers                                        | Other                                                        |
==> +-------------+------+--------+----------------------------------------------------+--------------------------------------------------------------+
==> | EmptyResult |    0 |      0 |                                                    |                                                              |
==> | UpdateGraph |    0 |      0 | employee, employee, order, order, product, product | MergeNode; :Employee; MergeNode; :Order; MergeNode; :Product |
==> | Slice       |    0 |      0 |                                                    | { AUTOINT0}                                                  |
==> | LoadCSV     |    1 |      0 | row                                                |                                                              |
==> +-------------+------+--------+----------------------------------------------------+--------------------------------------------------------------+

We've now got employees, products and orders in the graph. Now let's create relationships between the trio:
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MATCH (employee:Employee {employeeId: row.EmployeeID})
MATCH (order:Order {orderId: row.OrderID})
MATCH (product:Product {productId: row.ProductID})
MERGE (employee)-[:SOLD]->(order)
MERGE (order)-[:PRODUCT]->(product)

If we profile that, we'll notice that Eager has sneaked in again!
==> +----------------+------+--------+-------------------------------+-----------------------------------------------------------+
==> | Operator       | Rows | DbHits | Identifiers                   | Other                                                     |
==> +----------------+------+--------+-------------------------------+-----------------------------------------------------------+
==> | EmptyResult    |    0 |      0 |                               |                                                           |
==> | UpdateGraph(0) |    0 |      0 | order, product, UNNAMED318    | MergePattern                                              |
==> | Eager          |    0 |      0 |                               |                                                           |
==> | UpdateGraph(1) |    0 |      0 | employee, order, UNNAMED287   | MergePattern                                              |
==> | Filter(0)      |    0 |      0 |                               | Property(product,productId) == Property(row,ProductID)    |
==> | NodeByLabel(0) |    0 |      0 | product, product              | :Product                                                  |
==> | Filter(1)      |    0 |      0 |                               | Property(order,orderId) == Property(row,OrderID)          |
==> | NodeByLabel(1) |    0 |      0 | order, order                  | :Order                                                    |
==> | Filter(2)      |    0 |      0 |                               | Property(employee,employeeId) == Property(row,EmployeeID) |
==> | NodeByLabel(2) |    0 |      0 | employee, employee            | :Employee                                                 |
==> | Slice          |    0 |      0 |                               | { AUTOINT0}                                               |
==> | LoadCSV        |    1 |      0 | row                           |                                                           |
==> +----------------+------+--------+-------------------------------+-----------------------------------------------------------+

In this case the Eager happens on our second call to MERGE, as Michael identified in his post:
"The issue is that within a single Cypher statement you have to isolate changes that affect matches further on, e.g. when you CREATE nodes with a label that are suddenly matched by a later MATCH or MERGE operation."
We can work around the problem in this case by using a separate query to create each relationship:
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MATCH (employee:Employee {employeeId: row.EmployeeID})
MATCH (order:Order {orderId: row.OrderID})
MERGE (employee)-[:SOLD]->(order)

==> +----------------+------+--------+-------------------------------+-----------------------------------------------------------+
==> | Operator       | Rows | DbHits | Identifiers                   | Other                                                     |
==> +----------------+------+--------+-------------------------------+-----------------------------------------------------------+
==> | EmptyResult    |    0 |      0 |                               |                                                           |
==> | UpdateGraph    |    0 |      0 | employee, order, UNNAMED236   | MergePattern                                              |
==> | Filter(0)      |    0 |      0 |                               | Property(order,orderId) == Property(row,OrderID)          |
==> | NodeByLabel(0) |    0 |      0 | order, order                  | :Order                                                    |
==> | Filter(1)      |    0 |      0 |                               | Property(employee,employeeId) == Property(row,EmployeeID) |
==> | NodeByLabel(1) |    0 |      0 | employee, employee            | :Employee                                                 |
==> | Slice          |    0 |      0 |                               | { AUTOINT0}                                               |
==> | LoadCSV        |    1 |      0 | row                           |                                                           |
==> +----------------+------+--------+-------------------------------+-----------------------------------------------------------+

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MATCH (order:Order {orderId: row.OrderID})
MATCH (product:Product {productId: row.ProductID})
MERGE (order)-[:PRODUCT]->(product)

==> +----------------+------+--------+------------------------------+--------------------------------------------------------+
==> | Operator       | Rows | DbHits | Identifiers                  | Other                                                  |
==> +----------------+------+--------+------------------------------+--------------------------------------------------------+
==> | EmptyResult    |    0 |      0 |                              |                                                        |
==> | UpdateGraph    |    0 |      0 | order, product, UNNAMED229   | MergePattern                                           |
==> | Filter(0)      |    0 |      0 |                              | Property(product,productId) == Property(row,ProductID) |
==> | NodeByLabel(0) |    0 |      0 | product, product             | :Product                                               |
==> | Filter(1)      |    0 |      0 |                              | Property(order,orderId) == Property(row,OrderID)       |
==> | NodeByLabel(1) |    0 |      0 | order, order                 | :Order                                                 |
==> | Slice          |    0 |      0 |                              | { AUTOINT0}                                            |
==> | LoadCSV        |    1 |      0 | row                          |                                                        |
==> +----------------+------+--------+------------------------------+--------------------------------------------------------+

MERGE, SET
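I try to make LOAD CSV scripts as idempotent as possible, so that if we add more rows or columns of data to our CSV we can rerun the query without having to recreate everything.
This can lead you towards the following pattern, where we're creating suppliers: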
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MERGE (supplier:Supplier {supplierId: row.SupplierID})
SET supplier.companyName = row.SupplierCompanyName

We want to ensure that there's only one Supplier with that SupplierID, but we might be incrementally adding new properties and decide to just replace everything using the 'SET' command. If we profile that query, the Eager lurks:
==> +----------------+------+--------+--------------------+----------------------+
==> | Operator       | Rows | DbHits | Identifiers        | Other                |
==> +----------------+------+--------+--------------------+----------------------+
==> | EmptyResult    |    0 |      0 |                    |                      |
==> | UpdateGraph(0) |    0 |      0 |                    | PropertySet          |
==> | Eager          |    0 |      0 |                    |                      |
==> | UpdateGraph(1) |    0 |      0 | supplier, supplier | MergeNode; :Supplier |
==> | Slice          |    0 |      0 |                    | { AUTOINT0}          |
==> | LoadCSV        |    1 |      0 | row                |                      |
==> +----------------+------+--------+--------------------+----------------------+

We can work around this, at the cost of a bit of duplication, by using 'ON CREATE SET' and 'ON MATCH SET':
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:/Users/markneedham/projects/neo4j-northwind/data/customerDb.csv" AS row
MERGE (supplier:Supplier {supplierId: row.SupplierID})
ON CREATE SET supplier.companyName = row.SupplierCompanyName
ON MATCH SET supplier.companyName = row.SupplierCompanyName

==> +-------------+------+--------+--------------------+----------------------+
==> | Operator    | Rows | DbHits | Identifiers        | Other                |
==> +-------------+------+--------+--------------------+----------------------+
==> | EmptyResult |    0 |      0 |                    |                      |
==> | UpdateGraph |    0 |      0 | supplier, supplier | MergeNode; :Supplier |
==> | Slice       |    0 |      0 |                    | { AUTOINT0}          |
==> | LoadCSV     |    1 |      0 | row                |                      |
==> +-------------+------+--------+--------------------+----------------------+

With the data set I've been working with, this let me avoid OutOfMemory exceptions in some cases and cut the time taken to run the query by a factor of 3 in others.
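The duplication in the 'ON CREATE SET' / 'ON MATCH SET' workaround grows with every property added, so one option is to generate both clauses from a single mapping. This is only a sketch: `merge_with_set` is a hypothetical Python helper that builds query text, and it assumes the CSV rows are aliased as `row` and that all names passed in are trusted identifiers (nothing is escaped).

```python
def merge_with_set(alias, label, key_prop, key_column, props):
    """Build a MERGE clause whose ON CREATE SET and ON MATCH SET parts match.

    props maps node property names to CSV column names. All names are
    interpolated as-is, so they must come from trusted input.
    """
    # One shared assignment list keeps the two SET clauses in sync.
    assignments = ", ".join(
        "{0}.{1} = row.{2}".format(alias, prop, column)
        for prop, column in props.items()
    )
    return "\n".join([
        "MERGE ({0}:{1} {{{2}: row.{3}}})".format(alias, label, key_prop, key_column),
        "ON CREATE SET " + assignments,
        "ON MATCH SET " + assignments,
    ])


clause = merge_with_set(
    "supplier", "Supplier", "supplierId", "SupplierID",
    {"companyName": "SupplierCompanyName"},
)
print(clause)
```

The generated text still needs the USING PERIODIC COMMIT and LOAD CSV lines in front of it, and it is worth profiling the result to confirm that no Eager operator appears on your Neo4j version.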
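As time goes on I expect all of these scenarios to be addressed, but as of Neo4j 2.1.5 these are the patterns that I've identified as being overly eager.
If you know of any others, do let me know and I can add them to the post or write a second part.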
Source: https://www.javacodegeeks.com/2014/10/neo4j-cypher-avoiding-the-eager.html