PL/SQL Practices On BULK COLLECT limit

简介: Best practices for knowing your LIMIT and kicking %NOTFOUND I have started using BULK COLLECT whenever I need to fetch large volumes of data.

Best practices for knowing your LIMIT and kicking %NOTFOUND

I have started using BULK COLLECT whenever I need to fetch large volumes of data. This has caused me some trouble with my DBA, however. He is complaining that although my programs might be running much faster, they are also consuming way too much memory. He refuses to approve them for a production rollout. What's a programmer to do?

The most important thing to remember when you learn about and start to take advantage of features such as BULK COLLECT is that there is no free lunch. There is almost always a trade-off to be made somewhere. The tradeoff with BULK COLLECT, like so many other performance-enhancing features, is "run faster but consume more memory."

Specifically, memory for collections is stored in the program global area (PGA), not the system global area (SGA). SGA memory is shared by all sessions connected to Oracle Database, but PGA memory is allocated for each session. Thus, if a program requires 5MB of memory to populate a collection and there are 100 simultaneous connections, that program causes the consumption of 500MB of PGA memory, in addition to the memory allocated to the SGA.

Fortunately, PL/SQL makes it easy for developers to control the amount of memory used in a BULK COLLECT operation by using the LIMIT clause.

Suppose I need to retrieve all the rows from the employees table and then perform some compensation analysis on each row. I can use BULK COLLECT as follows:

PROCEDURE process_all_rows
IS
TYPE employees_aat
IS TABLE OF employees%ROWTYPE
INDEX BY PLS_INTEGER;
l_employees employees_aat;
BEGIN
SELECT *
BULK COLLECT INTO l_employees
FROM employees;
FOR indx IN 1 .. l_employees.COUNT
LOOP
analyze_compensation
(l_employees(indx));
END LOOP;
END process_all_rows;

 

Very concise, elegant, and efficient code. If, however, my employees table contains tens of thousands of rows, each of which contains hundreds of columns, this program can cause excessive PGA memory consumption.

Consequently, you should avoid this sort of "unlimited" use of BULK COLLECT. Instead, move the SELECT statement into an explicit cursor declaration and then use a simple loop to fetch many, but not all, rows from the table with each execution of the loop body, as shown in Listing 1.

Code Listing 1: Using BULK COLLECT with LIMIT clause

PROCEDURE process_all_rows (limit_in IN PLS_INTEGER DEFAULT 100)
IS
CURSOR employees_cur
IS
SELECT * FROM employees;
TYPE employees_aat IS TABLE OF employees_cur%ROWTYPE
INDEX BY PLS_INTEGER;
l_employees employees_aat;
BEGIN
OPEN employees_cur;
LOOP
FETCH employees_cur
BULK COLLECT INTO l_employees LIMIT limit_in;
FOR indx IN 1 .. l_employees.COUNT
LOOP
analyze_compensation (l_employees(indx));
END LOOP;
EXIT WHEN l_employees.COUNT < limit_in;
END LOOP;
CLOSE employees_cur;
END process_all_rows;

 

The process_all_rows procedure in Listing 1 requests that up to the value of limit_in rows be fetched at a time. PL/SQL will reuse the same limit_in elements in the collection each time the data is fetched and thus also reuse the same memory. Even if my table grows in size, the PGA consumption will remain stable.

How do you decide what number to use in the LIMIT clause? Theoretically, you will want to figure out how much memory you can afford to consume in the PGA and then adjust the limit to be as close to that amount as possible.

From tests I (and others) have performed, however, it appears that you will see roughly the same performance no matter what value you choose for the limit, as long as it is at least 25. The test_diff_limits.sql script, included with the sample code for this column, at otn.oracle.com/oramag/oracle/08-mar/o28plsql.zip, demonstrates this behavior, using the ALL_SOURCE data dictionary view on an Oracle Database 11g instance. Here are the results I saw (in hundredths of seconds) when fetching all the rows (a total of 470,000):

Elapsed CPU time for limit of 1 = 1839
Elapsed CPU time for limit of 5 = 716
Elapsed CPU time for limit of 25 = 539
Elapsed CPU time for limit of 50 = 545
Elapsed CPU time for limit of 75 = 489
Elapsed CPU time for limit of 100 = 490
Elapsed CPU time for limit of 1000 = 501
Elapsed CPU time for limit of 10000 = 478
Elapsed CPU time for limit of 100000 = 527

 

Kicking the %NOTFOUND Habit

I was very happy to learn that Oracle Database 10g will automatically optimize my cursor FOR loops to perform at speeds comparable to BULK COLLECT. Unfortunately, my company is still running on Oracle9i Database, so I have started converting my cursor FOR loops to BULK COLLECTs. I have run into a problem: I am using a LIMIT of 100, and my query retrieves a total of 227 rows, but my program processes only 200 of them. [The query is shown in Listing 2.] What am I doing wrong?

Code Listing 2: BULK COLLECT, %NOTFOUND, and missing rows

PROCEDURE process_all_rows
IS
CURSOR table_with_227_rows_cur
IS
SELECT * FROM table_with_227_rows;
TYPE table_with_227_rows_aat IS
TABLE OF table_with_227_rows_cur%ROWTYPE
INDEX BY PLS_INTEGER;
l_table_with_227_rows table_with_227_rows_aat;
BEGIN
OPEN table_with_227_rows_cur;
LOOP
FETCH table_with_227_rows_cur
BULK COLLECT INTO l_table_with_227_rows LIMIT 100;
EXIT WHEN table_with_227_rows_cur%NOTFOUND;     /* cause of missing rows */
FOR indx IN 1 .. l_table_with_227_rows.COUNT
LOOP
analyze_compensation (l_table_with_227_rows(indx));
END LOOP;
END LOOP;
CLOSE table_with_227_rows_cur;
END process_all_rows;

 

You came so close to a completely correct conversion from your cursor FOR loop to BULK COLLECT! Your only mistake was that you didn't give up the habit of using the %NOTFOUND cursor attribute in your EXIT WHEN clause.

The statement

EXIT WHEN
table_with_227_rows_cur%NOTFOUND;

 

makes perfect sense when you are fetching your data one row at a time. With BULK COLLECT, however, that line of code can result in incomplete data processing, precisely as you described.

Let's examine what is happening when you run your program and why those last 27 rows are left out. After opening the cursor and entering the loop, here is what occurs:

1. The fetch statement retrieves rows 1 through 100.
2. table_with_227_rows_cur%NOTFOUND evaluates to FALSE, and the rows are processed.
3. The fetch statement retrieves rows 101 through 200.
4. table_with_227_rows_cur%NOTFOUND evaluates to FALSE, and the rows are processed.
5. The fetch statement retrieves rows 201 through 227.
6. table_with_227_rows_cur%NOTFOUND evaluates to TRUE, and the loop is terminated—with 27 rows left to process!

Next Steps

 

READ more Best Practice PL/SQL

DOWNLOAD
Oracle Database 11g
sample code for this column

When you are using BULK COLLECT and collections to fetch data from your cursor, you should never rely on the cursor attributes to decide whether to terminate your loop and data processing.

So, to make sure that your query processes all 227 rows, replace this statement:

EXIT WHEN
table_with_227_rows_cur%NOTFOUND;
with
EXIT WHEN
l_table_with_227_rows.COUNT = 0;

 

Generally, you should keep all of the following in mind when working with BULK COLLECT:

  • The collection is always filled sequentially, starting from index value 1.
  • It is always safe (that is, you will never raise a NO_DATA_FOUND exception) to iterate through a collection from 1 to collection.COUNT when it has been filled with BULK COLLECT.
  • The collection is empty when no rows are fetched.
  • Always check the contents of the collection (with the COUNT method) to see if there are more rows to process.
  • Ignore the values returned by the cursor attributes, especially %NOTFOUND.
目录
相关文章
|
2天前
|
SQL 存储 Oracle
Oracle的PL/SQL定义变量和常量:数据的稳定与灵动
【4月更文挑战第19天】在Oracle PL/SQL中,变量和常量扮演着数据存储的关键角色。变量是可变的“魔术盒”,用于存储程序运行时的动态数据,通过`DECLARE`定义,可在循环和条件判断中体现其灵活性。常量则是不可变的“固定牌”,一旦设定值便保持不变,用`CONSTANT`声明,提供程序稳定性和易维护性。通过 `%TYPE`、`NOT NULL`等特性,可以更高效地管理和控制变量与常量,提升代码质量。善用两者,能优化PL/SQL程序的结构和性能。
|
2天前
|
SQL Perl
PL/SQL经典练习
PL/SQL经典练习
15 0
|
2天前
|
SQL Perl
PL/SQL编程基本概念
PL/SQL编程基本概念
18 0
|
2天前
|
SQL Perl
PL/SQL Developer 注册机+汉化包+用户指南
PL/SQL Developer 注册机+汉化包+用户指南
19 0
|
2天前
|
SQL Oracle 关系型数据库
|
2天前
|
SQL Oracle 关系型数据库
Oracle的PL/SQL游标属性:数据的“导航仪”与“仪表盘”
【4月更文挑战第19天】Oracle PL/SQL游标属性如同车辆的导航仪和仪表盘,提供丰富信息和控制。 `%FOUND`和`%NOTFOUND`指示数据读取状态,`%ROWCOUNT`记录处理行数,`%ISOPEN`显示游标状态。还有`%BULK_ROWCOUNT`和`%BULK_EXCEPTIONS`增强处理灵活性。通过实例展示了如何在数据处理中利用这些属性监控和控制流程,提高效率和准确性。掌握游标属性是提升数据处理能力的关键。
|
2天前
|
SQL Oracle 安全
Oracle的PL/SQL循环语句:数据的“旋转木马”与“无限之旅”
【4月更文挑战第19天】Oracle PL/SQL中的循环语句(LOOP、EXIT WHEN、FOR、WHILE)是处理数据的关键工具,用于批量操作、报表生成和复杂业务逻辑。LOOP提供无限循环,可通过EXIT WHEN设定退出条件;FOR循环适用于固定次数迭代,WHILE循环基于条件判断执行。有效使用循环能提高效率,但需注意避免无限循环和优化大数据处理性能。掌握循环语句,将使数据处理更加高效和便捷。
|
2天前
|
SQL Oracle 关系型数据库
Oracle的PL/SQL条件控制:数据的“红绿灯”与“分岔路”
【4月更文挑战第19天】在Oracle PL/SQL中,IF语句与CASE语句扮演着数据流程控制的关键角色。IF语句如红绿灯,依据条件决定程序执行路径;ELSE和ELSIF提供多分支逻辑。CASE语句则是分岔路,按表达式值选择执行路径。这些条件控制语句在数据验证、错误处理和业务逻辑中不可或缺,通过巧妙运用能实现高效程序逻辑,保障数据正确流转,支持企业业务发展。理解并熟练掌握这些语句的使用是成为合格数据管理员的重要一环。
|
2天前
|
SQL Oracle 关系型数据库
Oracle的PL/SQL表达式:数据的魔法公式
【4月更文挑战第19天】探索Oracle PL/SQL表达式,体验数据的魔法公式。表达式结合常量、变量、运算符和函数,用于数据运算与转换。算术运算符处理数值计算,比较运算符执行数据比较,内置函数如TO_CHAR、ROUND和SUBSTR提供多样化操作。条件表达式如CASE和NULLIF实现灵活逻辑判断。广泛应用于SQL查询和PL/SQL程序,助你驾驭数据,揭示其背后的规律与秘密,成为数据魔法师。
|
2天前
|
SQL Oracle 关系型数据库
Oracle系列十一:PL/SQL
Oracle系列十一:PL/SQL