Problem description
I have a query that works fine. The problem is that part of the query is a string that needs to be read from a file. The query for each string produces 6 rows. I need the union of all the results for that file, so that the end result is a table of 6 x (number of strings) rows. I can read the file using Python.
I've already tried using parameterised queries. Each of them only returns the 6 rows for its own string.
Most of my Python code is based on BigQuery's documentation here.
    query = """
        SELECT pet_id, age, name
        FROM `myproject.mydataset.mytable`
        WHERE name = @name
        AND species = @species;
    """
    query_params = [
        bigquery.ScalarQueryParameter('name', 'STRING', 'Max'),
        bigquery.ScalarQueryParameter('species', 'INT64', 'Dog'),
        bigquery.ScalarQueryParameter('name', 'STRING', 'Alfred'),
        bigquery.ScalarQueryParameter('species', 'INT64', 'Cat')
    ]
    job_config = bigquery.QueryJobConfig()
    job_config.query_parameters = query_params
    query_job = client.query(
        query,
        # Location must match that of the dataset(s) referenced in the query.
        location='US',
        job_config=job_config)  # API request - starts the query

    # Print the results
    for row in query_job:
        print('{}: \t{}'.format(row.word, row.word_count))
How can I get a UNION ALL of many of these query results?
The output should look like:
    pet_id | age | name
    ___________________
    1      | 5   | Max
    2      | 8   | Alfred
Recommended answer
Please look at the example below, which uses public data (you can run the query as well):
    #standardSQL
    SELECT *
    FROM `bigquery-public-data.baseball.schedules`
    WHERE (year, duration_minutes) IN UNNEST([(2016, 187), (2016, 165), (2016, 189)])
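To see why a single IN UNNEST filter returns the same rows as a UNION ALL of one query per pair, here is a small pure-Python sketch. It only models the semantics and does not touch BigQuery; the sample rows and the helper names `query_one` and `query_in` are made up for illustration, and the equivalence assumes the pairs in the filter list are distinct.

```python
# Hypothetical sample rows standing in for a table:
# (year, duration_minutes, game)
rows = [
    (2016, 187, "game A"),
    (2016, 165, "game B"),
    (2016, 200, "game C"),
    (2015, 187, "game D"),
]
pairs = [(2016, 187), (2016, 165), (2016, 189)]


def query_one(rows, year, minutes):
    """Model of: WHERE year = @year AND duration_minutes = @minutes."""
    return [r for r in rows if r[0] == year and r[1] == minutes]


def query_in(rows, pairs):
    """Model of: WHERE (year, duration_minutes) IN UNNEST(@pairs)."""
    wanted = set(pairs)
    return [r for r in rows if (r[0], r[1]) in wanted]


# UNION ALL of one parameterised query per pair.
union_all = [r for year, minutes in pairs for r in query_one(rows, year, minutes)]

# One pass over the table with the whole list of pairs.
single_pass = query_in(rows, pairs)

# Same rows either way; row order is not guaranteed, so compare sorted.
assert sorted(union_all) == sorted(single_pass)
```

A pair with no matching row, such as (2016, 189) in this sample, simply contributes nothing in both versions.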
The key here is to provide an array of the values you want to filter the table with, and use IN UNNEST(array_of_values) to do the job, ideally like below:
    query = """
        SELECT pet_id, age, name
        FROM `myproject.mydataset.mytable`
        WHERE (name, species) IN UNNEST(@filter_array);
    """
It is a bit unfortunate that the BigQuery Python API doesn't let you specify ARRAY<STRUCT<STRING, INT64>> as a query parameter. So you may have to do:
    query = """
        SELECT pet_id, age, name
        FROM `myproject.mydataset.mytable`
        WHERE concat(name, "_", species) IN UNNEST(@filter_array);
    """
    array_of_pre_concatenated_name_and_species = ['Max_Dog', 'Alfred_Cat']
    query_params = [
        bigquery.ArrayQueryParameter('filter_array', 'STRING', array_of_pre_concatenated_name_and_species),
    ]
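Since the strings come from a file, the pieces above could be tied together roughly like this. It is only a sketch: the file layout (one `name,species` pair per line), the helper names `load_filter_keys` and `run_filtered_query`, and the assumption that both columns are strings that never contain `_` are mine, not from the question. The BigQuery calls (`ArrayQueryParameter`, `QueryJobConfig`, `client.query`) are the same ones used above.

```python
def load_filter_keys(lines, sep="_"):
    """Turn 'name,species' lines into pre-concatenated keys like 'Max_Dog'."""
    keys = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, species = (part.strip() for part in line.split(",", 1))
        keys.append(name + sep + species)
    return keys


def run_filtered_query(client, keys):
    """Run one query that returns the rows for every key at once."""
    # Imported here so load_filter_keys works without the library installed.
    from google.cloud import bigquery

    query = """
        SELECT pet_id, age, name
        FROM `myproject.mydataset.mytable`
        WHERE CONCAT(name, "_", species) IN UNNEST(@filter_array)
    """
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ArrayQueryParameter("filter_array", "STRING", keys),
        ]
    )
    # result() waits for the job to finish and yields the rows.
    return list(client.query(query, job_config=job_config).result())


# Stand-in for the lines of the input file (hypothetical contents).
sample_lines = ["Max,Dog", "Alfred,Cat", ""]
keys = load_filter_keys(sample_lines)
print(keys)
```

With a real file, passing the open file object to `load_filter_keys` should work the same way, because a file object iterates over its lines. If a name or species could contain the separator, a different `sep` would be needed on both the Python side and in the CONCAT call.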