I recently implemented an iterative summarizer at www.internalize.ai, but was disappointed with the latency of OpenAI’s GPT-3.5-turbo endpoint. Though I determined that using a smaller model would not solve my latency issues, I decided to try fine-tuning as a learning experience.
I had heard that Falcon had landed in the Hugging Face ecosystem, so I chose it as the model to fine-tune.
Vision
Provide either:
Three independent Falcon models, each trained on one of my three summarization tasks.
One Falcon model trained on all three summarization tasks.
Getting Started
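Hugging Face was kind enough to provide an example Colab notebook for fine-tuning the model. This notebook also takes advantage of QLoRA, which trains small low-rank adapters on top of a 4-bit quantized base model and so makes training feasible on far more modest hardware.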
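To give a feel for why low-rank adapters are cheap to train, here is a small back-of-the-envelope sketch. It assumes the standard LoRA formulation, where a weight matrix of shape d_out x d_in is frozen and two trainable factors of shapes d_out x r and r x d_in are added. The dimensions below are illustrative round numbers, not Falcon's actual layer sizes, and the function names are my own.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Number of weights in a dense d_out x d_in matrix."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Weights in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Illustrative sizes only; these are NOT Falcon's real dimensions.
d_out, d_in, rank = 4096, 4096, 16

adapter = lora_trainable_params(d_out, d_in, rank)
dense = full_params(d_out, d_in)
print(f"adapter weights: {adapter}")
print(f"dense weights:   {dense}")
print(f"fraction trained: {adapter / dense:.4%}")
```

With these example numbers, the adapter holds well under one percent of the weights of the matrix it modifies, which is where most of the memory savings during training come from.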
Generating Data
Inspired by Alpaca, I used GPT-3.5-turbo to generate extractive summarization completions for a subset of the cnn_dailymail dataset.
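The data generation loop can be sketched roughly as follows. This is not my actual pipeline code: `build_extract_prompt`, `make_records`, and the `complete` callable are hypothetical names, and `complete` stands in for whatever function sends a prompt to GPT-3.5-turbo and returns its text. The instruction wording is taken from the Extract prompt reproduced later in this post, though the exact whitespace differs.

```python
import json
from typing import Callable, Dict, Iterable, List, Tuple

EXTRACT_INSTRUCTIONS = (
    'Input JSON with key "text".\n\n'
    'Your job is to take text "text", extract all information conveyed by the '
    "text into a list of complete sentences, and provide a short title "
    "describing the content of the text.\n\n"
    'Your output should be a JSON with a key "title" pointing to the string '
    'title and a key "info_list" pointing to a list of strings representing '
    "the result of your job.\n\n"
)


def build_extract_prompt(article: str) -> str:
    """Wrap one article in the Extract instructions."""
    return EXTRACT_INSTRUCTIONS + "Input:\n" + json.dumps({"text": article}, indent=1) + "\n"


def make_records(
    articles: Iterable[str], complete: Callable[[str], str]
) -> Tuple[List[Dict[str, str]], int]:
    """Return (prompt, completion) records plus a count of rejected completions."""
    records, rejected = [], 0
    for article in articles:
        prompt = build_extract_prompt(article)
        completion = complete(prompt)
        try:
            json.loads(completion)  # keep only completions that parse as JSON
        except json.JSONDecodeError:
            rejected += 1
            continue
        records.append({"prompt": prompt, "completion": completion})
    return records, rejected


def fake_complete(prompt: str) -> str:
    """Stand-in for the real API call so the sketch runs offline."""
    return json.dumps({"title": "Example", "info_list": ["A sentence."]})


records, rejected = make_records(["Some article text."], fake_complete)
print(len(records), rejected)
```

Returning the rejected count alongside the records keeps any filtering visible rather than silent.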
I mistakenly anticipated an error-free training pipeline. During the course of training I encountered three issues. Finding them gave me an excellent opportunity to become more familiar with the HF codebase and to begin participating, in a minor way, in the open source community.
Issue #1: Gradient Accumulation Breaks at Epoch Boundary
One can find this issue on GitHub here. I need to do some more reading before I can summarize it properly, but it has since been fixed here.
Issue #2: Trainer Silently Drops Data ≤ max_seq_length
One can find this issue on GitHub here. Essentially, the provided training library silently drops examples from the training set. As a result, my model showed a phenomenal training loss but performed terribly.
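A cheap guard against this class of bug is to compare the number of examples before and after preprocessing. The sketch below is only an illustration: `toy_preprocess` is a hypothetical stand-in that discards short examples so that the check has something to catch, and it is not the library's actual code.

```python
from typing import List


def toy_preprocess(tokenized: List[List[int]], max_seq_length: int) -> List[List[int]]:
    """Hypothetical stand-in: truncates long examples and silently discards short ones."""
    return [ids[:max_seq_length] for ids in tokenized if len(ids) >= max_seq_length]


def dropped_fraction(n_before: int, n_after: int) -> float:
    """Fraction of examples that disappeared during preprocessing."""
    if n_before == 0:
        return 0.0
    return (n_before - n_after) / n_before


# Four fake tokenized examples of lengths 5, 12, 8, and 3.
tokenized = [[1] * 5, [1] * 12, [1] * 8, [1] * 3]
kept = toy_preprocess(tokenized, max_seq_length=8)

frac = dropped_fraction(len(tokenized), len(kept))
print(f"kept {len(kept)} of {len(tokenized)} examples ({frac:.0%} dropped)")
```

Logging this fraction before training starts would have flagged the problem immediately, instead of leaving it to show up as a suspiciously good training loss.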
Issue #3: Falcon End-of-Stream Tokenization
One can find this issue on GitHub here. Essentially, in the notebook provided by Hugging Face, the tokenization is done in such a way that the end-of-stream token is ignored during the loss calculation. This results in the model never learning to output end-of-stream, meaning that your correct result will always be followed by gibberish up to the supplied maximum length.
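The failure mode can be reproduced in a few lines of plain Python. The sketch assumes a common setup: the tokenizer has no dedicated padding token, so the pad token is set to the end-of-stream token, and the labels are then masked wherever the input equals the pad id. The token ids and function names are made up for illustration. The second function shows one possible fix, masking by attention mask instead; it is not necessarily the fix adopted upstream.

```python
from typing import List

IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def labels_masked_by_pad_id(input_ids: List[int], pad_token_id: int) -> List[int]:
    """Mask every position whose token equals the pad id (also hides a real EOS if pad == EOS)."""
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]


def labels_masked_by_attention(input_ids: List[int], attention_mask: List[int]) -> List[int]:
    """Mask only true padding positions, identified by attention_mask == 0."""
    return [tok if keep == 1 else IGNORE_INDEX for tok, keep in zip(input_ids, attention_mask)]


EOS = 11   # hypothetical token id
PAD = EOS  # assumption: pad token reuses the end-of-stream token

input_ids = [5, 6, 7, EOS, PAD, PAD]  # three content tokens, one real EOS, two pads
attention_mask = [1, 1, 1, 1, 0, 0]

print(labels_masked_by_pad_id(input_ids, PAD))
print(labels_masked_by_attention(input_ids, attention_mask))
```

In the first result the real end-of-stream token is masked along with the padding, so it never contributes to the loss. In the second it survives as a training target.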
Results
Validation beyond spot-checking is out of scope for this post. Manually verified outputs for each model are listed at the end of this section, in the following order.
Prompts: Extract, Rank, Rewrite
Outputs:
Raw tiiuae/falcon-7b: Extract, Rank, Rewrite
Single-Task Fine-Tuned tiiuae/falcon-7b: Extract, Rank, Rewrite
Multi-Task Fine-Tuned tiiuae/falcon-7b: Extract, Rank, Rewrite
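Some high-level comments:
The raw model does not know when to output the EOS token, so it keeps repeating itself until it reaches the maximum length.
The extract-only model tries to perform the extract task on every prompt and essentially copies its input.
The multi-task model's output looks pretty good.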
EXTRACT_PROMPT = '\n Input JSON with key "text".\n\nYour job is to take text "text", extract all information conveyed by the text into a list of complete sentences, and provide a short title describing the content of the text.\n\nYour output should be a JSON with a key "title" pointing to the string title and a key "info_list" pointing to a list of strings representing the result of your job.\n\nInput:\n {\n "text": "(CNN) -- North Korean leader Kim Jong-il has added his brother-in-law to a military board in a move analysts say paves the way for an heir, according to South Korea\'s state-sponsored Yonhap news agency. Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board. The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said. Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August. His brother-in-law, Jang Song Thaek, is considered his right-hand man, according to Yonhap. Jang, who has been married to Kim\'s sister since 1972, currently serves as a director of the Workers\' Party, Yonhap said. "Kim wants to keep the military in check and secure loyalty to both the military and the party," Cha Doo-hyeogn, a North Korea expert, told Yonhap. Kim also increased the number of members in the military agency to 13, from eight, Yonhap said. "Overall, the power of the National Defense Commission was strengthened," Seoul\'s Unification Ministry spokesman, Kim Ho-nyoun, told Yonhap in a briefing. There were no other major changes in the new parliament, which signifies that Kim, 67, is prepared to maintain the status quo as he readies someone to take over from him, analysts told Yonhap. Kim\'s recent health problems and long absence from public functions have prompted speculation on whether he is ready to groom an heir to the world\'s only communist dynasty. 
But the secretive nation shields its internal affairs from international scrutiny. Analysts told Yonhap that Jang may serve as caretaker for Kim\'s successor, who will possibly be one of his three sons."\n }\n '
RANK_PROMPT = '\n Input JSON with key "title" and key "info_list".\n\nSuppose you are interested in the input "title" value. For each element of the "info_list" provide a score from 0 to 1 indicating how interesting this piece of information is.\n\nYour output should be a JSON with a "info_list_scored" pointing a list of (string, double) tuples representing the result of your job.\n\nInput:\n {"title": "Attacks and Peace Agreement in Iraq", "info_list": ["Attacks across Iraq in the past 24 hours left 11 police recruits and six civilians dead.", "Among the attacks was a suicide bombing outside the house of an Awakening Council leader, which killed a 7-year-old.", "A mortar round also hit a busy outdoor market, killing three civilians and injuring nine.", "Insurgents targeted a minibus filled with police recruits, killing 11 people.", "Iraqi security forces arrested 15 people in connection with the attack.", "The U.S. military killed a senior al Qaeda in Iraq leader in northern Iraq.", "Meanwhile, a peace agreement was being established in Baghdad\'s Sadr City, a site of previous fighting between Iraqi security forces and Shiite militias.", "The agreement is intended to clear the neighborhood of weaponry and outlaws and restore stability to the area.", "Recent troops\' movement to clear mines has gone without resistance from Shiite militia members who previously fought against Iraqi and U.S. troops.", "The trial of Saddam Hussein-era officials Tariq Aziz, Ali Hassan al-Majeed, and six others resumed, with charges related to the executions of 42 Iraqi merchants in 1992."]}\n '
REWRITE_PROMPT = '\n Input JSON with keys "title" and "info_list".\n\nYour job is to write a paragraph containing the information from "info_list" about the given "title". Your result should only use information from the "info_list".\n\nYour output should be a JSON with a key "text" containing the result.\n\nInput:\n {"title": "Final Presidential Debate 2008", "info_list": ["Sen. John McCain challenged Sen. Barack Obama on his policies, judgment and character during the final presidential debate.", "Obama said he could bring fundamental change to the country and tried to link McCain to President Bush.", "McCain stated that he was not President Bush and that he would give a new direction to the economy and the country.", "McCain took jabs at Obama, with polls showing the latter leading the race in most national polls.", "The third and final debate took place at Hofstra University and was moderated by Bob Schieffer of CBS News.", "McCain drew attention to Obama\'s relationship with former 1960s radical, Bill Ayers, and his judgment.", "Obama argued that Ayers was not involved in his campaign, and his association was being used by McCain to change the conversation from the economy.", "The candidates talked about taxes, spending, abortion rights, and other issues, with both outlining specific plans and specific cuts.", "McCain and Obama disagreed on abortion rights, with Obama advocating sex education and improved access to contraception.", "The candidates also debated the American Dream and whether tax cuts should be targeted at less wealthy Americans.", "A CNN/Opinion Research poll found 58% of people who watched the debate believed Obama did the best job compared to 31% for McCain.", "Schieffer signed off with his mother\'s words saying \\"Go vote now. It will make you feel big and strong.\\""]}\n '
Output:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in-law Jang Song Thaek to a top military board.",
"The addition of his kin to the powerful National Defense Commission also solidifies his standing, Yonhap said.",
"Kim was reappointed Thursday as chairman of the military board in his first major public appearance since a reported stroke in August."
]
}
The JSON should be formatted as follows:
{
"title": "North Korean leader Kim Jong-il adds brother-in-law to military board",
"info_list": [
"Kim Jong-il has named his brother-in
}
Output:
{"info_list_scored": [("Attacks across Iraq in the past 24 hours left 11 police recruits and six civilians dead.", 0.5), ("Among the attacks was a suicide bombing outside the house of an Awakening Council leader, which killed a 7-year-old.", 0.4), ("A mortar round also hit a busy outdoor market, killing three civilians and injuring nine.", 0.3), ("Insurgents targeted a minibus filled with police recruits, killing 11 people.", 0.2), ("Iraqi security forces arrested 15 people in connection with the attack.", 0.1), ("The U.S. military killed a senior al Qaeda in Iraq leader in northern Iraq.", 0.1), ("Meanwhile, a peace agreement was being established in Baghdad's Sadr City, a site of previous fighting between Iraqi security forces and Shiite militias.", 0.1), ("The agreement is intended to clear the neighborhood of weaponry and outlaws and restore stability to the area.", 0.1), ("Recent troops' movement to clear mines has gone without resistance from Shiite militia members who previously fought against Iraqi and U.S. troops.", 0.1), ("The trial of Saddam Hussein-era officials Tariq Aziz, Ali Hassan al-Majeed, and six others resumed, with charges related to the executions of 42 Iraqi merchants in 1992.")]}
Note:
The input JSON is a string.
The output JSON is a string.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The output JSON is a list of strings.
The input JSON is a list of strings.
The
}
Output:
{"text": "\"The third and final debate took place at Hofstra University and was moderated by Bob Schieffer of CBS News.\"\n\n\"McCain drew attention to Obama's relationship with former 1960s radical, Bill Ayers, and his judgment.\"\n\n\"Obama argued that Ayers was not involved in his campaign, and his association was being used by McCain to change the conversation from the economy.\"\n\n\"The candidates talked about taxes, spending, abortion rights, and other issues, with both outlining specific plans and specific cuts.\"\n\n\"McCain and Obama disagreed on abortion rights, with Obama advocating sex education and improved access to contraception.\"\n\n\"The candidates also debated the American Dream and whether tax cuts should be targeted at less wealthy Americans.\"\n\n\"A CNN/Opinion Research poll found 58% of people who watched the debate believed Obama did the best job compared to 31% for McCain.\"\n\n\"Schieffer signed off with his mother's words saying \"Go vote now. It will make you feel big and strong.\""}
Note:
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is a string.
You may assume that the "info_list" is a JSON array.
You may assume that the "title" is
{"title": "North Korean Leadership", "info_list": ["North Korean leader Kim Jong-il has added his brother-in-law Jang Song Thaek to a top military board.", "Jang serves as a director of the Workers' Party and has been married to Kim's sister since 1972.", "Analysts say this move strengthens Jang's standing and secures loyalty to both the military and the party.", "Kim also increased the number of members in the military agency to 13 from eight, further strengthening his power.", "There were no other major changes in the new parliament, indicating that Kim is prepared to maintain the status quo as he readies someone to take over from him.", "Kim's recent health problems and long absence from public functions have prompted speculation on whether he is ready to groom an heir to the world's only communist dynasty.", "Jang may serve as caretaker for Kim's successor, possibly one of his three sons."]}
{"title": "Attacks and Peace Agreement in Iraq", "info_list": ["Attacks across Iraq in the past 24 hours left 11 police recruits and six civilians dead.", "Among the attacks was a suicide bombing outside the house of an Awakening Council leader, which killed a 7-year-old.", "A mortar round also hit a busy outdoor market, killing three civilians and injuring nine.", "Insurgents targeted a minibus filled with police recruits, killing 11 people.", "Iraqi security forces arrested 15 people in connection with the attack.", "The U.S. military killed a senior al Qaeda in Iraq leader in northern Iraq.", "Meanwhile, a peace agreement was being established in Baghdad's Sadr City, a site of previous fighting between Iraqi security forces and Shiite militias.", "The agreement is intended to clear the neighborhood of weaponry and outlaws and restore stability to the area.", "Recent troops' movement to clear mines has gone without resistance from Shiite militia members who previously fought against Iraqi and U.S. troops.", "The trial of Saddam Hussein-era officials Tariq Aziz, Ali Hassan al-Majeed, and six others resumed, with charges related to the executions of 42 Iraqi merchants in 1992."]}
{"title": "Final Presidential Debate 2008", "info_list": ["Sen. John McCain challenged Sen. Barack Obama during the final presidential debate on October 15, 2008.", "The debate was moderated by Bob Schieffer of CBS News and was held at Hofstra University.", "Obama argued that he could bring fundamental change to the country and tried to link McCain to President Bush.", "McCain stated that he was not President Bush and that he would give a new direction to the economy and the country.", "The candidates talked about taxes, spending, abortion rights, and other issues with both outlining specific plans and specific cuts.", "McCain and Obama disagreed on abortion rights, with Obama advocating sex education and improved access to contraception.", "The candidates also debated the American Dream and whether tax cuts should be targeted at less wealthy Americans.", "A CNN/Opinion Research poll found 58% of people who watched the debate believed Obama did the best job compared to 31% for McCain.", "Schieffer signed off with his mother's words saying \"Go vote now. It will make you feel big and strong.\""]}
{"title": "North Korean Leadership", "info_list": ["North Korean leader Kim Jong-il has added his brother-in-law Jang Song Thaek to a military board.", "This move is seen as paving the way for an heir.", "Kim has named his brother-in-law to a top military post.", "The addition of Jang to the powerful National Defense Commission solidifies his standing.", "Kim was reappointed as chairman of the military board in his first major public appearance since a reported stroke in August.", "Jang Song Thaek is considered Kim's right-hand man.", "He has been married to Kim's sister since 1972 and currently serves as a director of the Workers' Party.", "Kim wants to keep the military in check and secure loyalty to both the military and the party.", "He also increased the number of members in the military agency to 13, from eight.", "Analysts say Jang may serve as caretaker for Kim's successor, who will possibly be one of his three sons.", "There were no other major changes in the new parliament, which signifies that Kim is prepared to maintain the status quo.", "Kim's recent health problems and long absence from public functions have prompted speculation on whether he is grooming an heir to the world's only communist dynasty.", "North Korea shields its internal affairs from international scrutiny, making it difficult to verify such claims."]}
{"info_list_scored": [["Attacks across Iraq in the past 24 hours left 11 police recruits and six civilians dead.", 0.7], ["Among the attacks was a suicide bombing outside the house of an Awakening Council leader, which killed a 7-year-old.", 0.8], ["A mortar round also hit a busy outdoor market, killing three civilians and injuring nine.", 0.6], ["Insurgents targeted a minibus filled with police recruits, killing 11 people.", 0.9], ["Iraqi security forces arrested 15 people in connection with the attack.", 0.6], ["The U.S. military killed a senior al Qaeda in Iraq leader in northern Iraq.", 0.5], ["Meanwhile, a peace agreement was being established in Baghdad's Sadr City, a site of previous fighting between Iraqi security forces and Shiite militias.", 0.8], ["The agreement is intended to clear the neighborhood of weaponry and outlaws and restore stability to the area.", 0.7], ["Recent troops' movement to clear mines has gone without resistance from Shiite militia members who previously fought against Iraqi and U.S. troops.", 0.6], ["The trial of Saddam Hussein-era officials Tariq Aziz, Ali Hassan al-Majeed, and six others resumed, with charges related to the executions of 42 Iraqi merchants in 1992.", 0.4]]}
{"text": "The Final Presidential Debate of 2008 took place at Hofstra University and was moderated by Bob Schieffer of CBS News. During the debate, Sen. John McCain challenged Sen. Barack Obama on his policies, judgment, and character. Obama attempted to link McCain to President Bush, while McCain argued that he could bring fundamental change to the country. The debate centered around the economy and McCain emphasized the need for a new direction. The candidates discussed taxes, spending, abortion rights, and other issues, presenting specific plans and cuts. They disagreed on abortion rights, with Obama advocating sex education and improved access to contraception. The debate also touched on the American Dream and whether tax cuts should be targeted at less wealthy Americans. A CNN/Opinion Research poll found that 58% of people who watched the debate believed Obama did the best job compared to 31% for McCain. The debate ended with Schieffer signing off with his mother's words, encouraging people to vote."}